THEME


The  pace  of  military  engagements  is  now  so  rapid  that  the  time  taken  for  target  detection 
and  identification  is  often  a  serious  impediment  and  limits  the  overall  effectiveness  of  the 
response. The problem is becoming relatively more important as the performance of conventional
types of sensors is improved, as new sensors are introduced, and as the virtues of using sensors
in  combination  are  exploited.  It  is  particularly  acute  in  manned  interdictor  aircraft  and  arises 
in  many  other  avionics  applications. 

Up  to  now  sensor  data  has  been  presented  to  the  operator  in  essentially  “raw”  form. 
Pressures for a more rapid and comprehensive response have resulted in urgent attention to
the  possibilities  of  aiding  the  operator  by  enhancing  visibility  of  targets  to  be  detected  and 
identified  or  by  accomplishing  some  or  all  of  the  process  automatically. 

Considerable  effort  is  being  directed  to  accomplish  this  in  many  NATO  countries.  The 
function  of  this  symposium  was  to  bring  together  experts  in  the  field  to  discuss  their 
problems  and  solutions  in  a  broader  context,  and  to  identify:  solutions  that  may  have  wider 
applications,  theoretical  and  practical  constraints  of  a  general  nature,  limits  of  performance 
that  can  reasonably  be  expected  in  the  next  generation,  and  fruitful  lines  of  research  and 
development. 
OVERVIEW


by 

R.  Voles 


As  the  pace  of  military  operations  increases,  so  the  time  taken  for  target  detection  and  identification  becomes 
significant  in  determining  the  overall  effectiveness  of  the  response.  The  situation  is  particularly  acute  in  the  manned 
interdictor aircraft where urgent attention is now being given to enhancing the visibility of targets on the operator's
display.  As  the  hazards  to  manned  aircraft  mount,  there  is  a  rapidly  growing  need  for  the  functions  of  target  acquisition 
and  recognition  to  be  done  automatically. 

The  present  Conference  has  been  particularly  helpful  in  giving  the  participants  an  overview  of  the  impressive  strides 
that  are  now  being  taken  in  the  various  areas  of  this  broad  technology.  Although  much  remains  to  be  done,  the  progress 
to  date  is  very  impressive  and  the  user  can  look  forward  with  confidence  to  a  very  effective  range  of  techniques  and 
systems  becoming  available  in  the  medium  term. 

As regards the longer term, I shall take the opportunity to indulge in a little speculation into the problems that
still remain to be addressed. To do this, I take the human operator as the starting point. The human brain is
characterised by 10^10 neurons, each communicating at an effective bandwidth of only 20 Hz but connected to as many
as 10^4 neighbours.

Now, in the near future, we shall have VLSI circuits operating at 200 MHz, a factor of 10^7 faster than the
neuron. Although the neurophysiological system contains an impressive 10^10 neurons, it will not be long before we can
match this figure; with densities of 10^6 gates/chip at a cost of £5/chip, the 10^10 gates would amount to a modest £50k.
But despite matching each neuron with a gate operating 10^7 times faster, there is little doubt that an electronic system
such as this would be an intellectual moron.
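The arithmetic behind these figures can be checked with a short sketch. The constants are the ones quoted in the text (20 Hz, 200 MHz, 10^10 neurons, 10^6 gates/chip, £5/chip); the function and variable names are illustrative assumptions only.

```python
# Figures quoted in the text; all names here are hypothetical.
NEURON_BANDWIDTH_HZ = 20.0      # effective bandwidth of one neuron
VLSI_CLOCK_HZ = 200e6           # anticipated VLSI operating frequency
NEURON_COUNT = 10**10           # neurons in the human brain
GATES_PER_CHIP = 10**6          # anticipated gate density
COST_PER_CHIP_GBP = 5           # anticipated cost of one chip


def speed_ratio(clock_hz: float = VLSI_CLOCK_HZ,
                neuron_hz: float = NEURON_BANDWIDTH_HZ) -> float:
    """How many times faster a gate switches than a neuron communicates."""
    return clock_hz / neuron_hz


def chips_needed(gates: int = NEURON_COUNT,
                 gates_per_chip: int = GATES_PER_CHIP) -> int:
    """Number of chips required to provide one gate per neuron."""
    return -(-gates // gates_per_chip)  # ceiling division


def total_cost_gbp(cost_per_chip: int = COST_PER_CHIP_GBP) -> int:
    """Cost of matching every neuron with one gate."""
    return chips_needed() * cost_per_chip


if __name__ == "__main__":
    print(f"speed ratio : {speed_ratio():.0e}")
    print(f"chips needed: {chips_needed()}")
    print(f"total cost  : £{total_cost_gbp()}")
```

The sketch only reproduces the orders of magnitude; it says nothing about connectivity, which is the point taken up next.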

Electronic  and  neurophysiological  systems  seem  to  differ  principally  in  the  following  respects: 

(a)  in  addition  to  the  electrical  process,  the  neurophysiological  system  appears  to  operate  a  (chemical)  memory 
mechanism; 

(b) the neuron has an inherent ability to take an arithmetic sum of up to 10^4 inputs;

(c)  the  richness  of  the  interconnections  between  the  neurons  of  the  neurophysiological  system  compared  with  the 
extreme  simplicity  of  those  in  electronic  systems. 

There  is  no  reason  why  we  should  not  emulate  the  neurophysiological  processes  in  (a)  and  (b)  with  suitable 
electronic devices; but the problem of connectivity is another matter. Modern solid-state circuits are essentially
two-dimensional structures, so even when more connectivity has been introduced (as in the case of the array processor and a
custom-designed picture-processing chip) the interconnections are between neighbours in the same plane and usually only
amount to about 10 in toto. Thus, the factor of perhaps 10^3 in the richness of the synaptic connectivity of the neuron
compared with that of the solid-state circuit seems to endow the neurophysiological system with such a large difference in
performance as to put into shadow the adverse speed ratio of 10^7.

These considerations suggest therefore that connectivity is a most important area for research. On the theoretical
side,  it  would  be  of  very  great  value  to  understand  even  the  rudiments  of  the  principles  involved.  On  the  device  side, 
there  seems  to  be  a  tremendous  incentive  to  invent  three-dimensional  solid-state  structures  with  a  facility  for  extensive 
interconnection.  The  fact  that  optical  devices  have  an  inherent  facility  for  multiple  connectivity  suggests  that  they  may 
well  have  an  important  part  to  play  in  the  future. 

It is a corollary of these speculations, and an exciting theme with which I can finish, that once we can invent our
way into the connectivity problem and have reached the levels achieved by the neurophysiological system, we shall have
the capability to make pattern recognition machines with the power of the human brain yet able to operate 10^7 times
faster! The future is unimaginably exciting.
THE OPTICAL CONTRAST OF LAND AND SEA TARGETS


by 

H.-E.  Hoffmann 

Deutsche Forschungs- und Versuchsanstalt für Luft- und Raumfahrt e.V. (DFVLR)
Institut für Physik der Atmosphäre

8031 Oberpfaffenhofen
Germany 


Summary 

In recent years the DFVLR has carried out field experiments to determine the visibility
ranges (maximum detection range, maximum recognition range or maximum identification
range) when observing ground to air or air to ground with different observation
devices. During visibility tests in Northern Germany in summer and autumn 1977 the
inherent contrast of a 1.5 t lorry and of a minesweeper-type test boat belonging to the
Bundeswehr was also measured. The measurements were taken using a photopically adapted
photometer installed in a Bell UH-1D helicopter. Some of these measurement results are
presented in diagrams and give information on the influence of the following parameters
on the inherent contrast of the land and sea targets: background illumination of the
target area, angle of elevation, time of day, parts of target, degree of cloud cover,
and angle of azimuth.

1. Introduction

In recent years field experiments have been carried out by DFVLR to determine the ranges
at which targets in the air or on the ground can be detected, recognized or identified
when using different observation devices. Visibility tests conducted in North Germany
during summer and autumn 1977, in the region of Bundeswehr test centre 91 in Meppen and
centre 71 in Eckernförde, included measurements of additional optical and
atmospheric-optical parameters besides those previously measured. From the ground it had previously
been possible to obtain individual test parameters, the horizontal standard visibility
and, if required, the adaptation light intensity. In the 1977 tests, using a Bell UH-1D
helicopter fitted with a photometer measuring in the visible part of the electromagnetic
spectrum, measurements were also taken of the inherent contrast of objects under
observation and of the sky to ground light intensity ratio.

After  a  short  description  of  the  measurements  in  section  2,  section  3  presents  results, 
classified  according  to  particular  test  parameters  for  the  inherent  contrast.  These 
results should provide a basis for some initial general statements, for example
concerning the dependence of this parameter on the type of background, type of lighting
(shadow,  semi-shadow,  sunlight),  angle  of  elevation  at  which  the  measurements  were  taken, 
time  of  day,  angle  between  direction  of  measurement  and  sun's  radiation,  and  degree  of 
cloud  cover.  It  should  be  noted  here  that  the  number  of  individual  values  on  which  these 
statements  are  based  is  in  some  cases  very  small,  and  that  it  is  essential  to  classify 
the  measured  values  according  to  further  parameters,  e.g.  the  horizontal  standard  range 
of  visibility,  the  sun's  angle  of  elevation. 

The values presented here for the inherent contrast should be used to provide a
theoretical interpretation of the observation results from [1, 2, 3, 4, 5] by the theory of
Duntley [6, 7], or may be used as input data for visibility models. During the 1977
field tests initial measurements were taken using the measuring helicopter to determine
further optical and atmospheric-optical parameters: sky-ground ratio, coefficient of
reflection, slant standard visibility, atmospheric light, and the contrast between sky
and water-horizon. Work was carried out on processing and evaluation of these
measurements [8, 9].
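As an illustration of how an inherent contrast value might serve as input to a visibility model, the following sketch applies the widely used contrast-reduction relation C_R = C_0 / (1 + K (1/T - 1)), where K is the sky-ground ratio and T = exp(-sigma R) is the transmittance of the path of length R. The extinction coefficient is taken from the standard visibility V by the Koschmieder relation sigma = 3.912 / V (2 % contrast threshold). The choice of this particular relation, the function names and the example numbers are assumptions made for illustration; they are not taken from the measurements reported here.

```python
import math


def extinction_coefficient(standard_visibility_km: float) -> float:
    """Koschmieder relation for a 2 % contrast threshold (assumed here)."""
    return 3.912 / standard_visibility_km


def apparent_contrast(inherent_contrast: float,
                      range_km: float,
                      standard_visibility_km: float,
                      sky_ground_ratio: float = 1.0) -> float:
    """Contrast seen at the observer after attenuation along the path.

    inherent_contrast      : C_0, contrast at (nearly) zero distance
    range_km               : path length R between target and observer
    standard_visibility_km : standard visibility V
    sky_ground_ratio       : K, ratio of sky to ground luminance
    """
    sigma = extinction_coefficient(standard_visibility_km)
    transmittance = math.exp(-sigma * range_km)
    return inherent_contrast / (
        1.0 + sky_ground_ratio * (1.0 / transmittance - 1.0)
    )


if __name__ == "__main__":
    # Hypothetical example: C_0 = -0.28, 20 km visibility, K = 3.
    for r_km in (0.0, 1.0, 2.0, 5.0):
        c_r = apparent_contrast(-0.28, r_km, 20.0, sky_ground_ratio=3.0)
        print(f"R = {r_km:4.1f} km  C_R = {c_r:+.3f}")
```

With K = 1 the relation reduces to simple exponential attenuation, C_R = C_0 exp(-sigma R).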

2. Measurement of the inherent contrast

The inherent contrast C_0


(L = light intensity of the object, Lc = light intensity of the background from
distances nearly 0)

was calculated using light intensities measured by means of a Pritchard photometer
installed in a Bell UH-1D helicopter (see Fig. 1). The photometer was mounted in the
helicopter such that the helicopter vibrations were only transmitted to the photometer in a
damped form. The photometer could be rotated through about 40 degrees in the horizontal
plane and about 10 degrees in the vertical plane. The values measured by the photometer
were registered either on an indicator or by recording apparatus. Values for the
inherent contrast involve a measurement error of about

± 0.02

due to the photometer. This error was determined in special measurements.
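A minimal sketch of the calculation follows, assuming the conventional (Duntley) definition of inherent contrast, C_0 = (L_object - L_background) / L_background, with both light intensities measured from nearly zero distance. That definition, the function names and the sample luminance values are assumptions for illustration; only the error of about 0.02 is taken from the text.

```python
PHOTOMETER_ERROR = 0.02  # measurement error of the contrast, as quoted in the text


def inherent_contrast(object_luminance: float,
                      background_luminance: float) -> float:
    """Assumed definition: C_0 = (L_object - L_background) / L_background."""
    if background_luminance <= 0.0:
        raise ValueError("background luminance must be positive")
    return (object_luminance - background_luminance) / background_luminance


def contrast_with_error(object_luminance: float,
                        background_luminance: float) -> tuple[float, float, float]:
    """Return (lower bound, C_0, upper bound) using the photometer error."""
    c0 = inherent_contrast(object_luminance, background_luminance)
    return c0 - PHOTOMETER_ERROR, c0, c0 + PHOTOMETER_ERROR


if __name__ == "__main__":
    # Hypothetical readings in arbitrary but identical units.
    low, c0, high = contrast_with_error(72.0, 100.0)
    print(f"C_0 = {c0:+.2f} (between {low:+.2f} and {high:+.2f})")
```

Under this definition an object darker than its background gives a negative value that cannot fall below -1, while a brighter object gives a positive value.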


The fairly wide confidence bands on the mean values (see section 3) were due to the
inhomogeneities in light intensities within the photometered objects and their
surroundings, and to the fact that during measurement the photometer could not be held constantly
on a particular point of the surface to be photometered.

Measurements  were  taken  with  the  helicopter  between  those  sections  of  the  tests  in  which 
the  observation  flights  were  designed  to  determine  the  maximum  detection  range,  the 
maximum recognition range or the maximum identification range [1, 2, 3, 4, 5].

The length of each individual measurement flight for the determination of the inherent
contrast and other optical and atmospheric-optical parameters was between 20
and 45 minutes.

A  minesweeper-type  test  boat  belonging  to  the  Navy  was  used  as  the  "measurement  object" 
on water; it was also used to obtain the observation values. Fig. 2 shows the test boat
and  the  measurement  helicopter  during  an  experiment. 

A 1.5 t army lorry was used for the air-ground measurements, and also to obtain the
observation values. Fig. 3 shows this lorry in one of its three positions during the
Summer 1977 tests.

3. Classification of measurement results according to particular test parameters

The  test  results  are  presented  in  diagrams,  classified  according  to  particular  test 
parameters.  The  diagrams  contain  the  mean  values  with  confidence  bands  for  a  statistical 
probability  of  95  %.  The  number  of  individual  values  from  which  the  mean  values  are 
taken  is  shown  on  the  diagrams  in  brackets  after  the  mean  values. 
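The way in which the confidence bands were computed is not stated. The sketch below shows one common approach, a two-sided 95 % interval on the mean based on Student's t distribution, which is an assumption here. The tabulated critical values cover only up to eleven individual values, and the sample data are hypothetical.

```python
import math
import statistics

# Two-sided 95 % critical values of Student's t, indexed by degrees of freedom.
T_CRIT_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def mean_with_confidence_band(values: list[float]) -> tuple[float, float]:
    """Return (mean, half-width of the 95 % confidence band)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two individual values are required")
    dof = n - 1
    if dof not in T_CRIT_95:
        raise ValueError("extend T_CRIT_95 for this number of values")
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, T_CRIT_95[dof] * standard_error


if __name__ == "__main__":
    # Hypothetical individual contrast values for one test condition.
    samples = [-0.30, -0.26, -0.28]
    mean, half_width = mean_with_confidence_band(samples)
    print(f"{mean:+.2f} ± {half_width:.2f} ({len(samples)})")
```

With only a few individual values the critical value is large, which is one reason why bands based on small samples come out wide.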

The  results  for  the  inherent  contrast  are  sub-divided  into  results  for  air-ground  and 
air-water  observations.  The  values  for  the  air-ground  inherent  contrast  are  valid  for 
a  1.5  t  army  lorry  (see  Fig.  3),  and  the  values  for  the  air-water  inherent  contrast 
apply  to  the  Bundeswehr  test  boat  (see  Fig.  2). 

The diagrams belonging to the following sections 3.1 and 3.2 of this paper are parts of
the diagrams of [8]. For more complete information on the results of the 1977 field
experiments determining values for the optical inherent contrast of a ground and a sea
target, consult the diagrams and tables of [8].

3.1 Air-ground inherent contrast

Figures 4-8 contain values for the inherent contrast of the 1.5 t lorry as
target for different measuring conditions: various backgrounds, various lighting
conditions, various angles of elevation, and various angles of azimuth between the
direction of measurement and the sun's radiation.

Figure  4  gives  an  impression  of  the  influence  of  different  backgrounds  on  the  inherent 
contrast.  These  measurements  were  taken  in  front  of  "green  grass  and  the  edge  of  a 
wood",  "green  cornfield",  and  "yellow  sand".  The  time  of  measurement  was  the  middle  of 
June.  During  the  tests  of  figure  4  the  target  was  in  shadow.  Then  the  value  for  the 
inherent  contrast  for  the  "green  grass  and  edge  of  wood"  background  was  -0.28,  for  the 
"green  cornfield"  -0.46  and  for  "yellow  sand"  -0.71. 

The type of light in the target area also altered the inherent contrast of the 1.5 t
lorry. When the lorry was in front of the wood (see figure 5), the inherent contrast was
-0.28 for shadow, -0.33 for semi-shadow and -0.42 for sunshine.

Figure 6 shows the influence of different angles of elevation when measuring
the inherent contrast of this specific target on the ground. For example, with a
background of "green grass and edge of wood" in shadow, the absolute value of the
inherent contrast decreased with increasing angle of elevation. At an angle of 6° it
was -0.34, at an angle of 12° it was -0.29 and at an angle of 18° it was -0.23.

Only six individual values were available for forming mean values to indicate
how the inherent contrast changes when the time of day, and with it the
angle between the direction of measurement and the sun's radiation, changes. In the
specific case of figure 7, where the angle changed from 165° to 240°, the inherent
contrast changed from -0.34 to -0.18.

When measuring from an elevation angle of 18° the specific ground target used during
the measurement presented two surfaces of more or less equal area but very different
brightness (see figure 3). The values of the inherent contrast were -0.29 for the darker
side panel and +0.14 for the brighter roof (see figure 8).

3.2 Air-sea inherent contrast

When measuring the inherent contrast of a minesweeper-type test boat, an attempt was made
to determine the influence of the following parameters: various degrees of water brightness,
various degrees of water and boat brightness, and various angles of azimuth between the
direction of measurement and the sun's direction.
When observing in approximately the direction of the sun's radiation while the sun is
shining, the background can be either direct sun radiation reflected on the water or
reflected sky light only. The inherent contrast value was -0.67 when only the reflected sky
light was the background, and -0.93 when the reflected direct sun radiation was the
background (see figure 9). These values are only valid
for the test conditions belonging to figure 9.

Because of the dependence of the reflectivity of water, the brightness of the water
background is different in front of and behind a ship. The influence of these various
brightnesses on the inherent contrast is shown in figure 10. The two values of figure 10 (-0.54 for
water background behind the boat and -0.43 for water background in front of the boat) were
determined when the angle of azimuth between the direction of measurement and the sun's
radiation was 101° and the degree of cloud cover was 8/8.

The plotted mean values for the inherent contrast in figure 11 were obtained when the
brightnesses of the water and of the boat were changed by variation of the cloud cover. When
the cloud cover was 0/8 - 3/8 the inherent contrast value was -0.72, and at a cloud
cover of 6/8 - 8/8 the inherent contrast value was -0.59. During these
experiments the angle of azimuth between the direction of measurement and the sun's radiation
was approximately 180°. For the water brightness, only reflected sky light was ever measured.
when direction of measurement and direction of the sun's radiation were approximately
coincident.  That  means  the  boat  was  brighter  than  its  background.  The  second  values 
represent  inherent  contrast  values  when  direction  of  measurement  and  direction  of  the 
sun's  radiation  were  approximately  opposite.  That  means  the  boat  was  darker  than  its 
background. 

With  an  angle  of  azimuth  of  approximately  0°  between  the  direction  of  measurement  and 
the  sun's  radiation  the  inherent  contrast  showed  positive  values:  +0.83  for  degree  of 
cloud cover between 0/8 and 3/8, +0.76 for degree of cloud cover between 6/8 and 8/8.

With  an  angle  of  azimuth  of  approximately  180°  between  the  direction  of  measurement  and 
the  sun's  radiation  the  inherent  contrast  showed  negative  values:  -0.72  for  degree  of 
cloud  cover  between  0/8  and  3/8,  -0.59  for  degree  of  cloud  cover  between  6/8  and  8/8. 
Fig. 1 Photometer in Bell UH helicopter


Fig.  2  Minesweeper-type  test  boat  with 
measurement  helicopter 


Fig. 5 Inherent contrast C_0 for various lighting conditions.
Target area: Green grass and edge of wood.


Fig. 6 Inherent contrast C_0 for various angles of elevation.
Target area: Green grass and edge of wood in shadow


Fig. 7 Inherent contrast C_0 for various times of day t and
angles φ between the direction of measurement and
the sun's radiation.

Target  area:  Green  grass  and  edge  of  wood  in  shadow 
Fig. 8 Inherent contrast C_0 for various parts of the 1.5 t lorry

Target  area:  Green  grass  and  edge  of  wood  in  sunshine. 
Angle  of  elevation:  18° 


Fig. 9 Inherent contrast C_0 for reflection of sky light
and reflection of the sun's rays.

Angle  of  elevation:  5° 

Degree  of  cloud  cover: 3/8  -  4/8 

Angle  of  azimuth  between  direction  of  measurement 
and  the  sun's  radiation:  160° 


Fig. 10 Inherent contrast C_0 for reflection of the sky
in front of and behind the boat.
Plotted categories: behind the boat, in front of the boat.

Degree of cloud cover: 8/8

Angle of azimuth between the direction of measurement
and the sun's radiation: 101°


Angle of azimuth between the direction of measurement
and the sun's radiation: 180° ± 30°


Fig. 12 Inherent contrast C_0 for various angles of azimuth
between the direction of measurement and the sun's
radiation.

Degree of cloud cover: 0/8 - 3/8


Fig. 13 Inherent contrast C_0 for various angles of azimuth
between the direction of measurement and the sun's
radiation.

Degree of cloud cover: 6/8 - 8/8
AN APPROACH TO AIRBORNE INFRARED SEARCH SET PERFORMANCE MODELING


Nancy E. MacMeekin
Naval Air Development Center
Warminster, Pennsylvania 18974 U.S.A.


ABSTRACT 

An airborne search set mathematical model is described which predicts sensor performance in terms
of probabilities of detection and false alarm in addition to calculating signals available for
processing by the sensor. Inputs and submodels which describe and/or calculate target signatures, backgrounds,
atmospheric effects and sensor design are discussed. Sensor performance is illustrated for several
atmospheric conditions.


INTRODUCTION 

Naval aircraft and surface vessels are subject to attack by high-speed enemy aircraft and missiles
launched from ranges of the order of 100 kilometers. The task of detecting and tracking such threats
at ranges sufficient to permit neutralizing their effects is being addressed in a number of
investigations. One approach is that of an infrared search and track set, which could passively acquire and
track missiles and aircraft at long ranges.

In  this  paper,  I  shall  describe  this  search  and  track  concept,  discuss  the  mathematical  modeling 
approach  being  pursued  at  the  Naval  Air  Development  Center  to  evaluate  this  concept,  and  present  some 
examples  of  the  results  of  the  modeling. 

CONCEPT  DEFINITION 

An airborne infrared search and track set is defined as an aircraft subsystem which can passively
detect, localize, and track a target within a prescribed search volume by sensing the infrared radiation
emitted and reflected by the target. Such a system is required to search a wide field of view, often
360° in azimuth, and to detect and track point targets at ranges of tens or hundreds of kilometers;
consequently, the target signals may be quite weak and easily lost in background clutter. This kind of
set is different from an airborne threat warning receiver in that the warning receiver is designed to
warn a pilot of imminent danger, generally from the rear, within 10 kilometers, to which he must respond
immediately by evasive maneuvering, decoy deployment, or other countermeasures. For a warning receiver,
then, the field of view is relatively small, perhaps 40° x 60°, and the signal levels high.

A concept diagram of a simple airborne infrared search and track set is shown in Figure 1. A
sensor enclosure, rotatable through 360°, houses the detectors, optics, cooler, and electronics
comprising the sensor. The gimballed sensor is stabilized so that the aircraft's random motion does not
affect the search of a given volume of space. As the sensor assembly scans in both azimuth and
elevation, infrared radiation passes through the protective window and, by means of the optics, is focused
on the cryogenically cooled detectors. The detectors convert changes in incident infrared radiation in
selected filtered wavebands to signal voltages. The signal processor in the sensor enclosure amplifies
the low-level signals from the detectors and then multiplexes the signals by sampling each detector
channel and combining the data into one channel. The multiplexed signal leaves the rotating sensor
enclosure through slip rings and enters a signal processor/conditioner where various techniques can be
used to separate the target from background clutter caused by things such as cloud edges or sun glints.

A  computer  generates  synthetic  video  for  the  pilot  and  provides  tracking  information  to  the  weapons 
system. 

The application of this concept to naval operations is illustrated in Figure 2. An airborne
infrared search and track set could be used for defending the fleet against attacking enemy aircraft
and against anti-ship missiles launched from distances beyond the ship's horizon. Equipment of this
type could also be used for aircraft self-defense, whether to alert a fighter to take necessary action
or to alert a patrol aircraft in time to enable it to get out of enemy interceptor range. Because the
infrared search and track set is passive, it could be of particular value in situations in which
radiation emission control has been mandated. In addition, it has advantages against threats of small radar
cross-section and where radar jamming is in effect.

MODEL DESCRIPTION

The objective of the modeling investigations at the Naval Air Development Center is to explore the
utility of an airborne infrared search and track set in meeting operational needs. At issue is whether
a system of desired performance characteristics can be built within technological, cost, size and weight
constraints. The critical problem seems to be detecting the target with an acceptable false alarm rate,
rather than tracking it; consequently, the modeling effort thus far has not included the tracking
functions.

The  approach  taken  to  modeling  performance  of  airborne  infrared  search  sets  has  been  to  acquire  a 
model  that  includes  the  necessary  attributes,  but  that  is  not  too  expensive  and  complicated  to  run.  The 
source  of  our  model,  described  in  reference  1,  is  B-K  Dynamics,  Inc.  Table  I  summarizes  the  kinds  of 
data  required  to  run  the  model  and  the  calculations  produced  by  the  model. 
TABLE I

AIRBORNE INFRARED SEARCH SET MODEL: INPUTS AND OUTPUTS

Inputs                                   Outputs

Target Signature                         Contrast Irradiance
Background Characteristics               Probability of Detection
Atmospheric Transmission and Emission    Probability of False Alarm
Operational Geometry
Sensor Characteristics
Processing Characteristics

The infrared signature of an aircraft or missile comprises varying amounts of radiation from the
exhaust plume, radiation from the aerodynamically heated fuselage, and reflected radiation. The
proportion of each of these is determined by the environmental conditions and by the target's configuration,
type of propulsion system and fuel, thrust, airspeed, and aspect angle relative to the sensor. Although
the plume radiation might be expected to dominate the signature, the plume can be totally or nearly
occluded by the fuselage of the target. The model can calculate a simple target signature, or it can
accept externally generated target data.

The  backgrounds  that  are  encountered  in  open-ocean  operations  are  sun,  sea,  sky,  and  clouds.  The 
most  simple  to  consider  are  clear-sky  backgrounds,  which  can  be  described  rather  well  in  terms  of  the 
air  temperature  and  the  zenith  angle.  The  radiation  from  the  sea  is  a  function  of  its  own  thermal 
emission  and  the  reflected  radiation  from  the  sky,  both  of  which  are  dependent  upon  the  sea  state  and 
the  angle  at  which  the  sea  is  viewed.  Backgrounds  are  complicated  by  the  presence  of  sunglints  from 
the sea, the presence of clouds of nonuniform radiance, and perhaps man-made objects which are not
threats.  In  the  model,  the  background  is  described  in  terms  of  an  array  of  200  small  solid  angular 
elements,  each  of  whose  emissivity,  temperature,  and  range  from  the  sensor  may  be  specified  by  the  user. 
The  model  could  be  modified  to  accept  a  tape  of  background  data. 
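The background description can be pictured as a simple data structure. The sketch below is not the B-K Dynamics implementation; it assumes that each element radiates as a greybody following Planck's law, and the class and function names are hypothetical. Only the three per-element quantities and the figure of 200 elements come from the text.

```python
import math
from dataclasses import dataclass

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


@dataclass(frozen=True)
class BackgroundElement:
    """One small solid-angle element of the background scene."""
    emissivity: float
    temperature_k: float
    range_m: float


def planck_spectral_radiance(wavelength_um: float, temperature_k: float) -> float:
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    lam = wavelength_um * 1e-6  # micrometers to meters
    per_meter = (2.0 * H * C**2 / lam**5
                 / math.expm1(H * C / (lam * K_B * temperature_k)))
    return per_meter * 1e-6     # per meter to per micrometer of wavelength


def element_spectral_radiance(element: BackgroundElement,
                              wavelength_um: float) -> float:
    """Greybody radiance of one element (assumption: no reflected component)."""
    return element.emissivity * planck_spectral_radiance(
        wavelength_um, element.temperature_k)


def uniform_background(emissivity: float, temperature_k: float,
                       range_m: float, n_elements: int = 200) -> list[BackgroundElement]:
    """Build an array of identical elements, e.g. for a clear-sky scene."""
    return [BackgroundElement(emissivity, temperature_k, range_m)
            for _ in range(n_elements)]


if __name__ == "__main__":
    scene = uniform_background(emissivity=0.9, temperature_k=280.0, range_m=1.0e5)
    print(len(scene), element_spectral_radiance(scene[0], 4.0))
```

A cloud edge or sunglint could then be represented by giving a few neighbouring elements a different temperature or emissivity.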

The atmosphere absorbs and emits radiation in amounts that depend upon the altitude, length,
temperature, water vapor and aerosol content of the path, and upon the wavelength. In the model, the
LOWTRAN 4 atmospheric model developed by the Air Force Geophysical Laboratory is used to calculate
atmospheric transmission and emission, as described in reference 2. We are trying to improve the
modeling by including cloud-free line of sight data and by inserting into the atmospheric model
meteorological data for specific geographical areas and seasons.

When  the  user  has  specified  the  operational  geometry  in  terms  of  sensor  and  target  altitudes,  range, 
and  target  aspect  angle,  the  model  has  all  the  information  it  needs  to  calculate  the  irradiances  at  the 
sensor  aperture  produced  by  the  target  and  the  background. 

In the model, the sensor is described in terms of the spectral bandpasses of the detectors and the
filters, the size of the aperture, the optical efficiency, the size, spacing, and material of the
detectors, the scan rate, the sampling time, and the characteristics of the preamplifiers and electronic
filtering. The model uses the spectral response characteristics of the optical filter and the detectors
to integrate the target and background irradiances and calculate the contrast irradiance, that is, the
difference in aperture irradiance between when a target is viewed and when it is not. The contrast
irradiance may be understood as the signal available for the sensor to detect. The model proceeds to
calculate at each sampling time the power incident on each detector from which signal voltages and
voltage variances are derived.
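The band integration that yields the contrast irradiance can be sketched as follows. This is a simplified illustration, not the model's own code: it assumes an unresolved (point) target, tabulated spectra on a common wavelength grid, trapezoidal integration, and no contribution from path emission. All names and numbers are hypothetical.

```python
from typing import Optional, Sequence


def trapezoid(x: Sequence[float], y: Sequence[float]) -> float:
    """Trapezoidal integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def contrast_irradiance(
    wavelengths_um: Sequence[float],
    target_intensity: Sequence[float],    # W sr^-1 um^-1
    transmittance: Sequence[float],       # atmospheric, 0..1
    response: Sequence[float],            # relative filter/detector response, 0..1
    range_m: float,
    obscured_background_intensity: Optional[Sequence[float]] = None,
) -> float:
    """In-band irradiance difference (W m^-2) between target present and absent.

    The optional last argument is the background radiance hidden by the
    target multiplied by the target's projected area (W sr^-1 um^-1).
    """
    n = len(wavelengths_um)
    if obscured_background_intensity is None:
        obscured_background_intensity = [0.0] * n
    spectra = (target_intensity, transmittance, response,
               obscured_background_intensity)
    if any(len(s) != n for s in spectra):
        raise ValueError("all spectra must share the wavelength grid")
    integrand = [(it - ib) * tau * r
                 for it, tau, r, ib in zip(target_intensity, transmittance,
                                           response, obscured_background_intensity)]
    return trapezoid(wavelengths_um, integrand) / range_m**2


if __name__ == "__main__":
    grid = [3.5, 4.0, 4.5, 5.0]  # micrometers
    e = contrast_irradiance(grid, [100.0] * 4, [0.5] * 4, [1.0] * 4, range_m=1000.0)
    print(f"contrast irradiance = {e:.2e} W/m^2")
```

Multiplying the result by the aperture area and the optical efficiency would give the power delivered to the detectors, from which a signal voltage follows through the detector responsivity.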

Finally,  the  user  specifies  his/her  choice  of  threshold  processing,  and  the  model  calculates  the 
probability  of  detection  for  each  sample  when  the  target  is  in  the  scene,  and  the  probability  of  false 
alarm when the target is not in the scene. Gaussian noise statistics are assumed. The threshold can be
specified  directly,  as  a  voltage,  or  it  can  be  calculated  by  the  model  relative  to  some  user-preselected 
multiple  of  the  sensor's  noise  equivalent  irradiance;  alternatively,  the  threshold  can  be  specified  in 
terms  of  a  desired  false  alarm  rate,  or,  if  desired,  a  specified  technique  of  adaptive  thresholding  can 
be  employed. 
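Under the Gaussian assumption, the two probabilities follow from the normal tail function. The sketch below is illustrative only; it assumes zero-mean noise, a target that simply shifts the mean by the signal voltage, and the same noise standard deviation with and without the target, none of which is stated in that detail here.

```python
import math


def q_function(x):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def detection_probabilities(signal_v, noise_sigma_v, threshold_v):
    """Return (Pd, Pfa) for one sample compared against a fixed threshold.

    Assumptions (illustrative): zero-mean Gaussian noise of standard
    deviation noise_sigma_v; the target adds signal_v to the mean.
    """
    pd = q_function((threshold_v - signal_v) / noise_sigma_v)
    pfa = q_function(threshold_v / noise_sigma_v)
    return pd, pfa
```

For example, a threshold set at three times the noise standard deviation gives a per-sample false alarm probability of about 0.00135 under these assumptions.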

EXAMPLES  OF  MODELING  RESULTS 

A  sample  problem  has  been  chosen  to  illustrate  the  use  of  the  model.  An  aircraft  is  traveling  at 
high  speed,  with  its  afterburner  on.  It  is  at  an  altitude  of  11  kilometers  and  is  being  viewed  against 
a  clear  sky  background  from  a  horizontal  aspect  angle  of  20°  relative  to  the  nose-on  direction.  The 
sensor  is  at  long  ranges,  at  an  altitude  of  5  kilometers.  It  is  assumed  that  a  waveband  in  the  3.5  to 
5.2  micrometer  region  is  of  interest. 

Figure 3 illustrates the spectral radiant intensity of the aircraft. Figure 4 is a plot of the
spectral transmittance through a tropical atmosphere for two different path lengths between the two
altitudes. The effect of carbon dioxide absorption in the 4.2 to 4.5 micrometer band is clearly seen.
Figure 5 shows, for a fixed path length of 100 kilometers, the dependence of the atmospheric
transmittance on the particular climatic model used. The differences would be larger at lower altitudes.
Figure 6 illustrates the apparent spectral radiant intensity of the target seen at various ranges.

Because most of the plume radiation falls in the region of strong carbon dioxide absorption, the
target's signature is severely attenuated.

The contrast irradiance calculated for the sample problem is shown as a function of range in
Figure 7 for three different wavebands. The wavebands selected are the broad band, and a band on
either side of the carbon dioxide absorption. The noise equivalent irradiance (NEI) of an example
3.5 to 5.2 micrometer sensor is also shown on the figure. By assuming a threshold value equal to a
multiple of NEI, one can find the detection range as the intersection of the contrast irradiance
curve for the 3.5 to 5.2 micrometer waveband with the threshold value.
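The graphical intersection can also be found numerically. The following sketch assumes the contrast irradiance is tabulated at increasing ranges and decreases with range; the function name and the linear interpolation between tabulated points are illustrative choices.

```python
def detection_range(ranges_km, contrast, threshold):
    """Range at which tabulated contrast irradiance first falls below threshold.

    ranges_km must be increasing and contrast is assumed to decrease with
    range. Returns None if the contrast is already below threshold at the
    first range, and the last tabulated range if it never falls below.
    """
    if contrast[0] < threshold:
        return None
    for k in range(1, len(ranges_km)):
        if contrast[k] < threshold:
            r0, r1 = ranges_km[k - 1], ranges_km[k]
            c0, c1 = contrast[k - 1], contrast[k]
            # Linear interpolation between the two bracketing points.
            return r0 + (c0 - threshold) * (r1 - r0) / (c0 - c1)
    return ranges_km[-1]
```

The threshold would be supplied as the chosen multiple of NEI, in the same units as the contrast irradiance.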

In  Figure  8,  the  probability  of  detection  of  the  aircraft  at  a  range  of  100  kilometers  against  the 
clear  sky  background  and  two  probability  of  false  alarm  curves  have  been  plotted  as  functions  of  the 
threshold  setting.  One  of  the  false  alarm  curves  represents  the  probability  of  threshold  crossing  when 
only  clear  sky  is  in  the  field  of  view.  The  other  represents  the  likelihood  of  threshold  crossing  when 
a cloud is in the field of view. The threshold is expressed as a multiple of NEI. The optimum
threshold is seen to be dependent on the acceptable probability of false alarm, in addition to the
required probability of detection.

CONCLUSION 

I have described the model being used at the Naval Air Development Center to calculate the
performance of postulated airborne infrared search sets. Results of sample calculations have been presented
to illustrate some of the effects of varying target, atmospheric, sensor and processing characteristics.
It appears that this model, which is relatively simple and inexpensive to run, will be quite useful in
qualitative and quantitative assessments of the feasibility of building an airborne infrared search set
to meet US Navy requirements.

Figure 1. Concept drawing of an infrared search and track set.

Figure 2. Utility of an airborne infrared search and track set for naval operations.

Figure 3. Spectral radiant intensity of an afterburning aircraft.

Figure 6. Apparent spectral radiant intensity of afterburning aircraft as seen at various ranges
through a tropical atmosphere.

Figure 8. Probabilities of detection and false alarm as functions of threshold setting for a range
of 100 km.


DETECTION AND CLASSIFICATION OF TARGETS IN INFRARED IMAGERY


Jean LOUCHET
94114 ARCUEIL CEDEX, FRANCE


This paper is devoted to the definition of an automatic processing chain for thermal infrared
images, for the detection and classification of ground targets. In these images, the
non-uniformity of the background and the possible multiplicity of targets call for relatively
complex processing, leading to the extraction of the target silhouettes.

Methods for classifying the targets are addressed last.




INTRODUCTION


The Laboratoire de Traitement d'Images has undertaken a study aimed at developing algorithms
for the automatic processing of infrared images, intended for the detection of ground targets.

The processing is carried out on images supplied by sensors operating at wavelengths
close to 4 µm and 10 µm.

These images differ from images at visible wavelengths in two respects:

- on the one hand, their quality is generally lower in resolution and in signal-to-noise ratio,
owing to the physical and technological limits of the sensors;

- on the other hand, they have a property essential for military applications: at wavelengths
close to 10 µm, the self-emission of objects at ambient temperatures is generally greater than
the reflected radiation, so that these images essentially carry information on the temperature
of the various points of the observed scene.

These algorithms are studied using the interactive processing system of the Laboratoire de
Traitement d'Images of ETCA, coupled to the CIMSA-MITRA 15 computer of the ETCA computing centre.

The processing methods presented are suited to ground images, in which the non-uniformity
of the background and the possible multiplicity of targets make relatively complex processing
necessary. The application of these methods to the "ALABAMA"* database is given at the end of
the report (paragraph 4.3).

The processing comprises three parts:

- Preprocessing, intended to reduce the effect of certain defects that affect the original image
and are due to sensor imperfections (chapter 1);

- Target detection, which consists in assigning each point of the image to one of two categories,
background or target (chapter 2); this stage generally leaves a large number of false alarms,
which must be distinguished from the real targets;

- False alarm elimination, intended to refine the processing of the preceding stage (chapter 3).

The chaining of these three parts, the application to the "ALABAMA" database and the overall
assessment are given in chapter 4.


* This is a set of 43 thermal infrared images used as a reference by several NATO countries
in order to compare their image processing methods.

1 - PREPROCESSING


1.1 - Mean line levels (figure 2)

A twofold defect can be seen visually, affecting each image over its whole surface: a local
mean level defect on the lines, with a period of 10 lines, and, on each line, a level defect
on the points, with a period of 3 points if one restricts oneself to a small part of the line
(see figure 2).

The local level defect on the lines is a common defect in infrared images; it arises because
the image is obtained by displacing a linear array (or a matrix) of photodiodes and the
dynamic responses of the individual detectors are not identical.

A simple method was tested to correct this defect, by adding a constant calculated for
each line of the image.

The hypothesis that the error on the lines is purely additive proved too restrictive:
while the result is good when working on a small window, in the opposite case the mean
level offset of a line relative to its neighbours does not remain constant along that line.

We tried to solve this problem first by two methods based on the calculation of a set of
transcoding functions: a further improvement over the preceding method was observed, but
defects of visually the same nature remained.

The method that finally proved best suited is the following.

Let $b_0$ be the current point, and consider the three consecutive line portions of the
image containing the points neighbouring $b_0$:

$$\begin{array}{ccccc}
a_0 & a_1 & a_2 & \dots & a_9 \\
b_0 & b_1 & b_2 & \dots & b_9 \\
c_0 & c_1 & c_2 & \dots & c_9
\end{array}$$

The means of these line portions are calculated with the terms $a_i$, $b_i$, $c_i$ of the
original image. The new value $b'_0$ calculated for $b_0$ is then obtained from $b_0$ and
these means (the printed expression survives only as the fragment "b'0 = b0 ... B)").

The correction of the defect is therefore adaptive along a line.

This method gives good results; it would probably be better still if it took 5 lines into
account instead of 3, with appropriate weighting, but the computing time on the MITRA 15
would suffer from the number of input-output operations. It takes no account of the
"periodicity" of 10 and can therefore be used for all sorts of images exhibiting line defects.


The optimum, for a given sensor, of the number of points over which the weighting is done
(19 in the example above) was determined empirically.
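An adaptive line-level correction of this kind can be sketched as follows. The exact correction formula is not legible in the text, so the rule used here is an assumption: each point is shifted by the difference between the average of the local means of the two neighbouring lines (A and C) and the local mean of its own line (B), over a 19-point window. The function name and the treatment of borders are also illustrative.

```python
def correct_line_levels(image, half_width=9):
    """Adaptive line-level correction over a (2*half_width + 1)-point window.

    image is a list of rows of grey levels. Assumed rule (not confirmed by
    the text): b0' = b0 + (A + C) / 2 - B, where A, B and C are the local
    means on the line above, the current line and the line below.
    The first and last rows are left unchanged.
    """
    rows, cols = len(image), len(image[0])
    out = [list(row) for row in image]
    for y in range(1, rows - 1):
        for x in range(cols):
            lo = max(0, x - half_width)
            hi = min(cols, x + half_width + 1)
            count = hi - lo
            mean_a = sum(image[y - 1][lo:hi]) / count
            mean_b = sum(image[y][lo:hi]) / count
            mean_c = sum(image[y + 1][lo:hi]) / count
            out[y][x] = image[y][x] + 0.5 * (mean_a + mean_c) - mean_b
    return out
```

Because the means are recomputed at every point, the offset applied to a line can vary along that line, which is the property the purely additive per-line constant lacked.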


The noise along a line shows an apparent periodicity of 3 points.


Here again, this "period" exists only on a small scale, and the method consisting in
calculating three transcoding functions (according to the residue modulo 3 of the abscissa
of the point) fails completely.

This defect is therefore not at all of the same nature as the mean line level defect
described previously.

We studied the possible causes of this defect and formed the following hypothesis,
which seems highly plausible: the sensor, which supplies images of mediocre resolution,
is fitted with a signal processing device intended to improve this resolution which,
by way of improvement, introduces into the image the noise that we are seeking to correct.
The solution to our problem will therefore consist in reconstructing the image as it was
before the "improvement".


We therefore simply programmed a "return to the original image" (which, though less sharp,
is also much less noisy) by a very moderate blurring of the lines.

$b_0$ being the current point, and $b_{-1}$ and $b_1$ its neighbours, the new value
calculated for $b_0$ is:

$$b'_0 = \frac{2\,b_0 + b_{-1} + b_1}{4}$$

This amounts not exactly to a return to the original image, but to a deblurring less
severe than the one performed in the sensor.
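The moderate line blur is a three-point weighted mean with weights 1, 2, 1. A minimal sketch follows; the function name and the choice to leave the two end points of each line unchanged are assumptions.

```python
def smooth_line(line):
    """Apply the 1-2-1 weighted mean along one image line.

    The first and last points are copied unchanged (an illustrative choice,
    since the text does not say how line ends are handled).
    """
    out = list(line)
    for x in range(1, len(line) - 1):
        out[x] = (2 * line[x] + line[x - 1] + line[x + 1]) / 4
    return out
```

Applied to a line alternating between two levels, this kernel replaces each interior point by the midpoint of the two levels, which shows how strongly it attenuates point-to-point noise.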


2 - TARGET DETECTION


2.1 - Introduction


Our aim is to obtain, from the original image, a two-level binary image representing the
silhouettes of the targets as well as possible.

Experience shows that uniform thresholding applied to the original image gives mediocre
results and that, even when the threshold level is set "by hand" by an operator, it is
often impossible to obtain both a correct silhouette extraction and a reasonable false
alarm rate (figure 4).

One remedy is therefore to apply to the image a threshold whose level, instead of being
constant, adapts locally to the mean level of the image. This reduces the false alarm rate
and improves the compromise between the ideal threshold levels of the different regions
of the image.


2.2.1 - Procedure

In practice, we chose an equivalent method, which consists in subtracting from the initial
image its low spatial frequency component and then performing conventional thresholding.

2.2.2 - Choice of the cut-off frequency

Choosing the cut-off frequency (that is, the width of the blur) requires the operator to
supply the machine with a value, for example the diameter of the target expressed in number
of digitisation points. It is found that this value need only be known to within a factor
of 2 or 3 for the results not to be appreciably affected. This "help from outside the
algorithm" can therefore be considered not excessive, all the more so since a rangefinder
is likely to be present on any system using algorithms as complex as those described in
this report.
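The subtraction of the low spatial frequency component can be sketched with a simple moving-average blur. The square averaging window, its truncation at the image borders and the function names are assumptions; the text specifies only that the blur width is tied to the expected target diameter.

```python
def box_blur(image, radius):
    """Mean over a (2*radius + 1)-point square window, truncated at the borders."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        y0, y1 = max(0, y - radius), min(rows, y + radius + 1)
        for x in range(cols):
            x0, x1 = max(0, x - radius), min(cols, x + radius + 1)
            total = sum(sum(image[yy][x0:x1]) for yy in range(y0, y1))
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out


def remove_background(image, radius):
    """Subtract the blurred (low spatial frequency) image from the image."""
    low = box_blur(image, radius)
    return [[image[y][x] - low[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]
```

A conventional fixed threshold applied to the output of `remove_background` then behaves like a threshold that follows the local mean level of the original image.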


To this end, the following procedure is carried out:

2.3 - Automatic thresholding


2.3.1 - Method of calculating the binarisation threshold

The method we developed for calculating the threshold is as follows:

- the order of magnitude Z of the size* of the targets is assumed to be known a priori
(this information is equivalent to information on the viewing distance, given the focal
length of the optics and the raster of the infrared camera used: see paragraph 2.3.3),

- the image is partitioned into N squares whose side is equal to Z,

- with each square is associated the highest grey level in that square
(that is, the grey level of its brightest point).

This yields a table T of N numbers between 0 and 255.

The simplest method then consists in taking as threshold the smallest element of T
("minimax thresholding"). This method works well for very simple images containing one or
more hot targets on a uniform background: if we call "alarm point" a point whose level is
above the threshold, the threshold chosen by minimax is the lowest threshold for which
at least one square contains no alarm point.

But as soon as the background is non-uniform (for example, dark ground and bright sky)
this method fails: if even one square lies entirely within the region of the image
corresponding to the ground, the sky will be detected as a target.

We therefore chose an improvement of the method, based on the following reasoning:

- if, as assumed, the squares are about the same size as the targets, and if there are
P targets, there will generally be 4P squares containing target points,

- if K is the probability that a square contains a false alarm, there will be about KN
squares containing false alarm points (K is of course much smaller than 1),

- there are therefore, if the image is suitably thresholded, 4P + KN squares containing
points above the threshold.

The threshold is therefore set to the value of the element of table T such that
N' = KN + 4P elements of the table are greater than it (N' < N is assumed).


* apparent diameter, expressed in number of digitisation points.


The parameters P and K are to be estimated a priori, once and for all, according to the
available images, so as to obtain a good compromise between the different images, and so
that, in a given image, the threshold level is affected as little as possible by the size
of the working window over which the threshold is calculated.


Tests carried out on many images from different databases showed that the parameters:

K = [illegible], P = 7

constitute a good compromise.

In practice, this formula gives a straight line as the graph of the function N → N', and
is valid only for large values of N (otherwise there is a risk of obtaining N' > N).


In order to extend it to all values of N, we introduced a saturation, by the formula:

$$N' = \min(KN + 4P,\; N - 1)$$

that is,

$$N - N' = \max\bigl((1 - K)\,N - 4P,\; 1\bigr)$$

(the numerical coefficients printed in these two expressions are not legible; they are
written here in terms of K and P).

The first 100 values of this function are shown in figure 10.
The threshold calculation is performed by the subroutine THRESH (see annex I).

2.3.2 - Thresholding

The threshold having been calculated by the method of paragraph 2.3.1, the image is
thresholded; this consists in calculating, from the initial image, an image with two
grey levels:

- if a point of the initial image has a grey level above the threshold, the corresponding
point of the thresholded image is given the value 0,

- if a point of the initial image has a grey level below the threshold, the corresponding
point of the thresholded image is given the value 255.

The result (and this is chiefly where use is made of the fact that, in thermal infrared
images, the grey levels of target points are on average higher than those of background
points) is a two-level image (0 and 255) in which:

- the target points have the value 0,

- the background points have the value 255.

This is the ideal case; in practice many imperfections arise, some affecting the quality
of the detected silhouettes (holes, targets in pieces), others, far more numerous,
corresponding to false alarms.

The thresholding must therefore be refined by a series of processing steps intended to
remedy these defects.

These processing steps are the subject of chapter 3.

3 - ELIMINATION OF FALSE ALARMS

The threshold having been calculated for each of the images obtained by the methods of
paragraphs 2.2 and 2.3, a binary image has finally been obtained. The elimination of
false alarms, which is the first step of identification, requires processing no longer
line by line but target by target. These targets, defined as the connected sets of alarm
points, must therefore be numbered.

3.1 - Elimination of point false alarms and hole filling

The image binarised by thresholding now takes the following form:

- point whose value in the previous image was below the threshold: 255,

- point whose value in the previous image was above the threshold: 0.


In other words, the background points are at level 255 and the target points at
level 0 (apart from false alarms and detection defects).

The image is thus ready for the connected-component numbering program
(see paragraph 3.2). At this stage, however, the image generally contains a large number
of point false alarms, which would normally be eliminated at later stages but which slow
down the next stage of processing considerably.

We therefore carry out a filtering step intended to simplify the image by eliminating
point or near-point alarms. This filtering is based on the following principle: given the
configuration of image points:

a b c d e
f g h i j
k l m n p

where (the image being binary) every point is worth 0 or 255, the sum

S = a+b+c+d+e+f+g+i+j+k+l+m+n+p
(the central point h being excepted)

is compared with the numbers 255×M and 255×N,

where M and N are the parameters of the filter (1 ≤ M < N ≤ 12).

This filtering has the advantage of distorting large targets only slightly while still
ensuring effective elimination and good filling of the holes remaining in the targets,
provided the parameters are well chosen (see figure 11).
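The text gives the sum S and the two comparison values but not the decision taken. The sketch below assumes a natural reading: when S is at most 255×M the neighbourhood is almost entirely target, so the central point is set to 0 (hole filling); when S is at least 255×N the neighbourhood is almost entirely background, so the central point is set to 255 (point alarm removed); otherwise it is left unchanged. This rule, the function name and the untouched border are all assumptions.

```python
def clean_binary(image, m, n):
    """Filter a 0/255 binary image using the 14 neighbours in a 3 x 5 window.

    Assumed rule (not stated in the text): S <= 255*m sets the centre to 0
    (target), S >= 255*n sets it to 255 (background), otherwise no change.
    Sums are taken on the input image; border points are copied unchanged.
    """
    rows, cols = len(image), len(image[0])
    out = [list(row) for row in image]
    for y in range(1, rows - 1):
        for x in range(2, cols - 2):
            s = sum(image[yy][xx]
                    for yy in (y - 1, y, y + 1)
                    for xx in range(x - 2, x + 3)) - image[y][x]
            if s <= 255 * m:
                out[y][x] = 0
            elif s >= 255 * n:
                out[y][x] = 255
    return out
```

With this reading, a small M fills only holes that are almost completely surrounded by target points, and a large N removes only alarms that are almost completely isolated, which is consistent with large targets being little distorted.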

3. Criteria for eliminating false alarms

At this stage of processing, the image is ready for the calculation of the parameters
relating to each of the detected targets. The segmented image containing the target
numbering has been obtained, and care has of course been taken to keep, not exactly the
original image, but the image resulting from its restoration (chapter 1); all the
information lost in the information compression that the detection of target silhouettes
represents is therefore in fact still available.

We can now bring in certain criteria which, until now, could not be taken into account.


3.1 - Shape criteria

These criteria involve only the shape of the contour of the detected object.

The most frequent cases of false alarms whose elimination depends simply on contour
criteria are:

a) - Object entirely contained within two consecutive image lines or two consecutive
image columns,

b) - Object of low density or very elongated,

c) - Object occupying a very large number of lines or columns (more than 4 times
the a priori size).

3.2 - Boundary criteria

After the preceding elimination, which removes the "obvious" false alarms, there usually
remain false alarms whose shape and dimensions no longer allow them to be distinguished
from real targets. Here we can bring in a criterion complementary to the preceding ones,
heavier but very powerful, which calls on the data of the original image. Its principle is
as follows: a false alarm is an object detected in error by the thresholding, whose edges
have no semantic significance, whereas the detected contour of a target consists of points
on the edge of the target, hence of points where the local contrast is strong in the
original image.

In the segmented image, no difference can be seen between these two types of contour;
going back to the original image, a target is distinguished from a false alarm by the fact
that the mean value of the gradient along its contour is very high.

A gradient image of the original image is therefore calculated (a "boundary gradient" as
defined in [1], annex II, is chosen), then, for each object, the parameters:

- IPERI, perimeter of the object

- SGRAD, integral of the gradient along the contour of the object

- ICONT = SGRAD/IPERI = mean of the gradient along the contour of the object.
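The three contour parameters can be sketched as follows, given a labelled image and a gradient image of the same size. The definition of a contour point (an object point with at least one 4-connected neighbour outside the object), the counting of the perimeter in contour points and the function names are assumptions; the rejection test ICONT < MAXGRAD/3 is the one listed in the annex.

```python
def contour_mean_gradient(labels, gradient, target_id):
    """Return (IPERI, SGRAD, ICONT) for the object numbered target_id.

    A contour point is taken to be an object point with at least one
    4-connected neighbour outside the object or outside the image.
    """
    rows, cols = len(labels), len(labels[0])
    iperi = 0
    sgrad = 0.0
    for y in range(rows):
        for x in range(cols):
            if labels[y][x] != target_id:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                yy, xx = y + dy, x + dx
                outside = not (0 <= yy < rows and 0 <= xx < cols)
                if outside or labels[yy][xx] != target_id:
                    iperi += 1
                    sgrad += gradient[y][x]
                    break
    icont = sgrad / iperi if iperi else 0.0
    return iperi, sgrad, icont


def is_false_alarm(icont, maxgrad):
    """Rejection test from the annex: reject when ICONT < MAXGRAD / 3."""
    return icont < maxgrad / 3
```

The gradient image itself would come from the boundary gradient of the reference cited above; any gradient magnitude image can be substituted to try the sketch.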


The algorithm can be summarised by the following diagram:


A preliminary step of "reframing and creation of a margin" was added at the start of the
chain so that the low-pass spatial filtering (blur, see paragraph 2.3.2) can be calculated
without edge effects (see figure 13).

It consists in surrounding the original image with a margin, of width at least equal to
the width of the blur to be calculated, in which the value of each point is obtained by
copying the corresponding point located on the edge of the original image.
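The margin creation amounts to padding the image by replicating its edge points. A minimal sketch, with an illustrative function name:

```python
def add_margin(image, margin):
    """Surround image with a margin whose points copy the nearest edge point."""
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(-margin, rows + margin):
        yy = min(max(y, 0), rows - 1)   # clamp row index to the image
        out.append([image[yy][min(max(x, 0), cols - 1)]
                    for x in range(-margin, cols + margin)])
    return out
```

The blur is then calculated on the enlarged image and only the central part, corresponding to the original frame, is kept.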

Results

The target detection and false alarm elimination program was tested on images no. 1 to 20
of the "Alabama" database. The results are as follows:

- Of 39 targets, 5 were not detected and 2 false alarms appeared, which corresponds to:


detection rate: 87 %
false alarm rate: 5 %

provided that the detection of overlapping targets is counted as correct.

- This series contains only three images in the 4 micron band, which is hardly
significant, but it is noted that, in these images, of 6 targets, 4 were detected,
2 lost and 1 false alarm occurred.


5 - CLASSIFICATION

A method of classification by correlation with learning is under study. The learning
algorithm, described below, consists in determining the best approximation of the
recognition parameter (for example, object of type A = 0, object of type B = +1) as a
linear combination of the parameters defining the image to be studied. This algorithm has
been tested on simple images and works satisfactorily; since, for hardware reasons, its
application is limited to images comprising no more than about thirty points, the
information relating to a target must be compressed into about thirty parameters, and this
in a manner that is stable with respect to variations due to changes in viewing conditions
(viewing angle, etc.). We have not yet studied how to compress this information; coding of
the contour seems to be the best route, although other methods can be envisaged.




Principle of the learning algorithm

Once a method of information compression has been chosen that associates n descriptive
parameters with each target image, a method must be found for classifying the target
described by these n parameters, which are called "objective parameters". Identification
is done by calculating p parameters (of the type: $x_i = 0$ if the object is of type A,
$x_i = 1$ if the object is of type B, etc.).

These p parameters are called "subjective parameters".

The problem is to express the p subjective parameters as a function of the n objective
parameters. The solution given here makes it possible to calculate the coefficients of the
linear mapping that best answers the question.

To this end, the operator must choose a representative series of training images for
which the ground truth is known.

For each of the targets in these images, the computer calculates the n objective
parameters according to the chosen compression method, and the operator personally
supplies the p subjective parameters describing the ground truth.

The role of the learning algorithm is to calculate, from these objective and subjective
data, which will generally be correlated, the n × p coefficients required.


Learning algorithm

Consider a family of m objects, each represented by n parameters. To each of these
objects the teacher assigns p parameters representing the class to which the object
belongs. The learning problem consists in finding the best approximation of the p
"subjective" parameters as a linear combination of the n objective parameters: this
amounts to finding the linear correlator which, applied to the object represented by its
n parameters, determines the class of the object.


Let $(x_i)_{i=1,\dots,n}$ be the objective parameters and $(x_i)_{i=n+1,\dots,n+p}$ the
subjective parameters.


Let M be the covariance matrix of the (n + p) centred parameters.
Writing $\tilde{x}_i = x_i - E(x_i)$, we have:

$$M_{ij} = E(\tilde{x}_i\,\tilde{x}_j)$$

where E is the mathematical expectation.


Sorting the eigenvalues $\lambda_i$ of M in decreasing order, one obtains the eigenvectors
$V_i$ ($i = 1,\dots,n+p$) with coordinates $(V_{ij})$ ($j = 1,\dots,n+p$).

t, 

I 


t 
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Pour  Its  p  dornioro  too  tour*  propre*,  Xj  e*t  potit  i  on  pout  dono  doriro  i 

r  -v. 

t  -  n  ♦  I.  n  *  p  :  >Ai  -  o  V£j  *j}  JO  oa  r  eiC  r4c„t.type. 

P*r  consequent,  ol  l'on  dcrit  lo*  p  equation*  (l  »  n  ♦  1 ,  n  ♦  p)  t 

n  n»p  .. 

.r,  Vij  *  J*  .  1  vij  *j  *  0 

J-l  J*n*l 


Loo  oolutiono  xj  ooront  dao  approximation*  do*  Xj  (j  •  d  «  I  1  g  »  p  J 


On  obtient 

done  le 

syst£me  de 

p  Equations 

lineaires  ^  p  inconnues  x* 

n*p 

n 

% 

(i  ■  n  ♦ 

t,  n  ♦  p) 

. E ,  vu 

j«n+l 

*j 

-  -  E 
j-l 

vij 

xj 

dont  In  resolution  donner*  pour  cheque  valour  1  *  n  ♦  1,  . n  +  p  i 

..  n 

(i  -  n  ♦  1.  n  ♦  p)  x'l  -  I  xj 


ddfiniaaant  lo  corrdlatour  ideal. 


Applique  k  un  ob  jet  S  quelconque,  co  corrdlatour  fournit  loo  p  par&metres. 


*i  (S) 


n 

t 


"1J 


•V. 


(S), 


prediction*  do*  paramktre*  *ubjectif«,  on  fonction  de*  paranktre*  objectifs 
eontrd*  (Xj  (S))j.|>n 

N.B. Contrary to what is most often seen, we have not proved the existence of an
optimal correlator but, which is more useful, we have given a method for computing
a correlator. That being so, one can speak of optimality only insofar as the correlator
is applied solely to the objects used for training.

The success of the method will therefore depend:

- on the one hand, on how representative the sample chosen for training is;
- on the other hand, on how well suited the information compression used to compute the
  n objective parameters is. A measure of this suitability is the degree of correlation
  between objective and subjective parameters, which can be gauged by how well or how
  badly the eigenvalues decrease.
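
As a minimal worked illustration of this construction (a hypothetical case, not taken from the study), take n = 1 objective parameter $x_1$ and p = 1 subjective parameter $x_2$, both centred and scaled to unit variance, with an assumed correlation coefficient $\rho$ between them (0 < $\rho$ < 1):

```latex
\begin{align*}
M &= \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \\
\lambda_1 &= 1 + \rho, \quad V_1 = \tfrac{1}{\sqrt{2}}\,(1,\; 1), \qquad
\lambda_2 = 1 - \rho, \quad V_2 = \tfrac{1}{\sqrt{2}}\,(1,\; -1), \\
V_{21}\, x_1 + V_{22}\, x'_2 &= 0
\;\Longrightarrow\; \tfrac{1}{\sqrt{2}}\, x_1 - \tfrac{1}{\sqrt{2}}\, x'_2 = 0
\;\Longrightarrow\; x'_2 = x_1, \quad a_{21} = 1 .
\end{align*}
```

In this special case the correlator simply returns $x'_2 = x_1$. The small eigenvalue $\lambda_2 = 1 - \rho$ shows that the prediction is trustworthy only when $\rho$ is close to 1, and the result differs from the ordinary least-squares predictor $\rho\, x_1$.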
5 - CONCLUSION

This part of the study has led to the development of an algorithm for the automatic
detection of targets and their contours. Tests carried out on the "Alabama" database
showed that it works satisfactorily in the majority of cases, since a detection rate
of 67 % is reached for a false-alarm rate of 5 %.

The next stage of the study, leading to automatic classification of the targets,
is currently in progress.


ANNEX - ELIMINATION OF FALSE ALARMS


Maximum value of the gradient over the target image
(see paragraph 2.3.3):


MAXGRAD (if ICONT < MAXGRAD/3, the target is rejected)
here, MAXGRAD/3 = 15
REAL-TIME GREY-LEVEL HISTOGRAM MANIPULATION


L.H. Guildford

Philips Research Laboratories
Redhill, Surrey RH1 5HA,
England.

SUMMARY 


This  paper  considers  the  problem  of  how  best  to  match  the  characteristics  of  a  video 
signal of high dynamic range to those of a display. In particular, it addresses a problem
often encountered in surveillance systems: that of adequately depicting small but often
important contrast changes within regions of widely differing mean brightness levels.
Situations  can  arise  where  any  change  in  overall  brightness  or  contrast  needed  to  reveal 
a  particular  object,  if  its  presence  were  even  known,  would  exclude  details  from  other 
important  areas  of  the  scene.  Details  necessary  for  identification  or  even  detection  of 
the  object  mostly  exist  in  areas  of  low  contrast.  The  problem  is  to  display  and  usefully 
observe  a  wide  dynamic  range  of  information  by  means  of  a  display  system  whose  operative 
grey-scale  range  is  but  a  fraction  of  that  apparently  necessary  to  present  the  scene. 

Imaging  systems  operating  in  the  visible,  IR,  X-ray  and  ultrasound  regions  of  the  spectrum 
fall  into  this  category. 

Published  literature  concerning  the  visibility  of  experimental  objects  against  differing 
backgrounds is reviewed and interpreted in a manner which allows an estimate to be made
of the number of grey-levels required to represent objects of varying sizes adequately
on  display. 

A  solution  to  the  problem  is  proposed,  which  involves  the  application  of  real-time  histogram 
modification  techniques  to  a  selected  sub-area  (Keyhole)  within  the  image,  combined  with 
overall 2-D edge enhancement. Our ACE processor, which has allowed us to implement this
algorithm and investigate the subjective effects of real-time operation, is described and
some pictorial results are presented. It has been found that this operator gives promising
results, especially where the net result is a local adaptive contrast stretch.

1.  INTRODUCTION 

Effective  surveillance  of  a  scene  and  quick  identification  of  an  object  of  interest  require 
that  the  information  presented  to  an  observer  has  to  be  carefully  matched  to  the  dynamic 
range  of  the  display  used  and  has  to  be  enhanced  so  that  it  may  be  rapidly  assimilated  by 
eye. Consideration has to be given not only to the limited ability of the eye to identify
vehicles and other objects described by fewer than 13 lines (Scott, F., 1970; Hollanda, 1970),
but  also  to  the  inability  of  a  display  to  present  sufficient  perceptible  grey-levels  to 
cover  the  range  of  detectable  signal-levels  produced  by  many  imaging  systems. 

In  this  paper  we  will  be  briefly  considering  the  interaction  of  these  various  effects.  A 
description  will  then  be  given  of  an  experimental  system  which  has  enabled  us  to  investigate 
both  contrast  modification  and  edge  enhancement  techniques  applied  individually  or  in 
parallel  to  real-time  video  signals.  Experimental  results  are  then  presented  and  discussed. 

In  general  we  have  restricted  our  activities  to  real-time  processing.  That  is  to  say  that 
our  processing  operators  either  work  at  the  scanned  pixel  rates  or  else  have  completed  their 
modifications  within  one  frame  period.  Other  constraints,  such  as  the  need  to  consider 
portability,  low  power  and  cost  effectiveness  have  also  influenced  our  work.  As  a  result 
we  have  concentrated  our  attentions  on  families  of  local  area  processing  operators. 

As  stated  above,  our  main  objective  has  been  to  bring  to  the  observer's  attention  the 
information  in  the  scene  that  was  of  immediate  interest  to  him.  In  general,  such  information 
is  characterised  by  low  amplitude,  high  spatial  frequency  signals,  often  situated  in  areas 
of  poor  local  contrast. 

2.  THE  RELATIONSHIP  BETWEEN  OBJECT  SIZE  AND  GREY-SCALE  VISIBILITY 

The  problem  of  how  best  to  display  an  image  in  order  that  the  minimum  of  available 
information  is  lost  is  a  complex  one,  involving  consideration  of  both  the  characteristics 
of  the  display  and  of  the  human  visual  chain.  Figure  1  shows  an  image  that  exemplifies 
many of the problems. Overall, the image has a wide dynamic range. However, within this
wide  range,  the  objects  of  interest  have  only  small  contrast  differences  with  respect  to 
their  local  background.  It  is  therefore  impossible  to  display  adequately  both  these  small, 
but  important  local  variations  and  the  complete  image.  This  type  of  dilemma  is  quite 
common  -  Figure  6  shows  some  more  images  which  suffer  from  similar  problems.  In  the  region 
containing  the  hidden  tank  there  is  little  detail  available.  The  contour  image,  which 
contains  only  edge  information,  does  not  indicate  its  presence  until  the  gain  has  been 
increased by a factor of 16. Such high contrast gains are incompatible with the
satisfactory  presentation  of  the  complete  picture. 

One  of  the  root  causes  of  this  problem  is  the  mismatch  between  the  dynamic  range  of  the 
video  signals  and  that  of  normal  CRT  displays.  The  chart  in  Figure  2  provides  a  simple 
comparison - in terms of detectable, noise-limited signal levels - between two
hypothetical IR sensors, normalised to discriminate 0.1°C, and the output from a high quality
TV imager. To do full justice to the incoming information, a display would have to be
capable of reproducing in excess of 300 discernible grey-levels for the TV and well in
excess of 400 levels for the IR systems.


However, this is too simplistic a view, since it has been found that the range of
'discernible  grey-levels'  observable  on  a  display  of  any  description  is  also  dependent 
upon  the  angle  that  the  objects  of  interest  subtend  at  the  eye. 

When  one  considers  the  theoretical  dynamic  range  of  a  display,  one  must  bear  in  mind  not 
only  the  above  effect,  but  also  the  ambient  lighting,  background  luminance,  screen 
reflectance,  flare  and  luminance  range  of  the  display.  Figure  3  shows  a  series  of  curves 
which  emphasise  the  effect  of  object  subtense  on  the  'Just  perceptible  relative  intensity' 
over  a  range  of  luminance  values.  Note  that  the  lower  luminance  threshold,  the  'luminance 
of  subjective  black',  is  itself  a  function  of  object  subtense.  These  curves  have  been 
compiled  from  data  and  information  given  in  reference  books  and  reports  which  cover  the 
subjective effects of various aspects of displays (Lowry, E.M., 1951; McIlwain, K., 1956;
Robinson, R.N., 1972). The information contained in Figure 3 has been transcribed in
Figure 4 in order to relate the number of discernible grey-levels to the subtense of the
object. The curve relates to an ambient illumination of 16 lm.m⁻² and takes into account
the  variation  in  the  luminance  of  subjective  black  and  the  background  luminance  of  the  CRT 
(reflectance  +  flare).  All  the  photometric  measurements  were  undertaken  using  controlled 
illumination  and  surfaces  of  known  size  and  reflectivity.  Only  the  background  luminance 
measurements  involved  the  use  of  a  CRT.  These  measurements  included  optical  and  electron 
flare and the reflectance of the ambient illumination (16 lm.m⁻²) at the faceplate. However,
effects due to line structure and noise in the luminance signal were not investigated by
these authors. The latter effect would tend to smooth out the difference between the
original set of 'Just perceptible relative intensity' readings; the value of ΔB/B would
therefore have to increase to make the change perceptible with noise superimposed, thus
further reducing the number of discernible grey-levels.

These  results,  which,  as  we  have  seen,  have  been  extrapolated  from  static  measurements 
made  under  controlled  conditions,  give  an  arguably  high  estimate  for  the  observable  dynamic 
range  of  grey-levels  on  a  display.  However,  they  appear  to  agree  broadly  (i.e.  within  a 
factor  of  2)  with  our  own  subjective  estimates. 

As  we  have  already  seen,  much  of  the  important  information  in  a  scene  consists  of  small 
local  variations  superimposed  on  areas  of  often  widely  differing  luminance.  Our  approach 
to  the  problem  of  displaying  such  images  is  to  use  statistical  information  from  a  given 
area  in  a  scene  to  enable  us  to  calculate  the  video  transfer  function  which  best  matches  it 
to  the  full  dynamic  range  of  the  display.  The  algorithm  used  is  based  upon  grey-level 
histogram  modification  (Rosenfeld,  A.,  1976,  Hummel,  R.,  1977,  Ketcham,  D.J.,  1976,  Kruger, 
R.P.,  1971).  We  have  also  found  that  the  results  of  this  adaptive  contrast  modification 
technique  can  be  further  improved  by  simultaneous  2-dimensional  edge  enhancement  (Rosenfeld, 
A.,  1976,  Levi,  L.,  1976,  Schreiber,  W.F.,  1970).  In  our  system  these  two  operations  are 
performed  in  parallel  on  the  separated  edge  and  brightness  signals,  which  are  then 
recombined  after  processing  to  give  the  final  output  signal. 


3.  EXPERIMENTAL  HARDWARE 

In order that the subjective effects of our proposed algorithms can be investigated in
real time, we have constructed a microprocessor-controlled real-time image processor,
which we call ACE (Adaptive Contrast and Edge processor). It is capable of operating on
8-bit digitized video signals at up to 15 megapixels.s⁻¹. A detailed block diagram
is given in Figure 5.


Figure 6, however, shows a much simplified block diagram of the ACE system, which is quite
flexible and allows us to perform, in parallel, processing operations on both brightness
and edge information in a scene.

These  operations  include: 


Brightness processing:

- Linear contrast stretch
- Gamma correction
- Thresholding
- Histogram equalisation/modification

Edge processing:

- 2-dimensional edge enhancement
- Adaptive spatial filtering
- Adaptive noise thresholding
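
The first three brightness operations listed above are point operations, so each can be expressed as a 256-entry look-up table of the kind that would be loaded into a RAM. The following Python sketch is illustrative only: the function names and parameters are assumptions and do not describe the ACE firmware.

```python
def linear_stretch_lut(lo, hi, levels=256):
    """Map the input range [lo, hi] linearly onto the full output range, clipping outside it."""
    top = levels - 1
    span = max(hi - lo, 1)  # guard against a zero-width range
    return [min(top, max(0, round((v - lo) * top / span))) for v in range(levels)]


def gamma_lut(gamma, levels=256):
    """Power-law (gamma) correction on a normalised 0..1 scale."""
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]


def threshold_lut(t, levels=256):
    """Binary threshold: levels at or above t become white, the rest black."""
    top = levels - 1
    return [top if v >= t else 0 for v in range(levels)]


def apply_lut(lut, pixels):
    """Pass a sequence of pixel values through a look-up table."""
    return [lut[p] for p in pixels]
```

For example, `apply_lut(linear_stretch_lut(64, 127), line)` expands the second quarter of the input range to the full output range for one scan line.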


Operator  interaction  is  via  a  keyboard  and  X-Y  oscilloscope  display,  which  is  used  to 
present  plots  of  system  transfer  functions,  input  or  output  histograms  and  other  relevant 
information. 


High  through-put  and  adaptivity  at  both  pixel  and  frame  rates  are  achieved  by  the  use  of 
parallel RAM look-up tables as the processing elements in each of the 6 main signal paths:
brightness, point, horizontal, vertical and 2 diagonal edge signals (Figure 5). Selection
between  the  4  parallel  RAMs  in  each  of  the  signal  paths  can  be  made  at  the  pixel  rate, 
providing  limited  real-time  adaptivity,  while  the  RAMs  themselves  are  loaded  by  the  system 
microprocessor,  thus  giving  further  adaptivity  but  in  this  case  at  frame  rate.  Other  units 
ancillary  to  the  basic  processing  elements  are  the  input  matrix  store  and  point/edge 
generator  which  furnish  the  6  input  signals  mentioned  above,  the  path  control  board,  which 
translates  microprocessor  mode  commands  into  various  path  selection  strategies  for  the 
processed  signals  at  the  pixel  rate  and  finally,  a  histogram  counter,  which  provides  the 
necessary  statistical  information  concerning  the  intensity  distribution  of  the  input 
brightness signal (i.e. the mean of a 3x3 pixel block surrounding the current processed
point). At the output of the system, the processed line or point and brightness signals
are recombined to give the complete output signal.

External  to  the  ACE  processor,  other  units  are  required  to  support  it.  These  include  an 
ADC  and  DAC,  a  video  sync  stripper,  a  clock  generator  and  a  Keyhole  generator.  The  use 
of  an  accurate  sync  stripper,  black-level  d.c.  clamp  and  clock  pulse  generator  is  important. 
Sync  signals  within  the  processor  will  only  degrade  its  performance,  using  up  unnecessary 
bits  in  the  digitized  pixel  word.  A  black-level  d.c.  clamp  samples  the  back  porch  of  the 
video waveform and maintains a known d.c. reference throughout the line scan. Other points
on the waveform may be sampled for reference purposes if required. The clock generator is
asynchronous and is capable of locking to a sync pulse on a line by line basis, so that the
vertical  justification  of  the  pixel  samples  is  maintained  to  within  a  fraction  of  a  pixel 
period.  Flywheel  synchronisation  circuits  are  of  little  use  when  trying  to  synchronise  to 
the  non-broadcast  standard  sync  signals  provided  by  imagers  using  opto-mechanical  scanning 
systems,  cheap  TV  cameras  and  video  tape  recorders.  The  Keyhole  generator  provides  flags 
which  delineate  a  variable  size  keyhole,  which  may  be  positioned  anywhere  within  the  frame. 


It is now possible to see how a complicated algorithm such as histogram modification may be
implemented on this system. At the end of a frame period, the microprocessor accesses the
histogram counter, which contains a count of the number of pixels in each of the 256 input
levels within the current Keyhole area. This information is used to determine the intensity
transfer function that will give the best approximation to the desired output histogram.
This transfer function is then loaded into one of the brightness look-up table RAMs, and
the path select logic set so that this RAM is selected during the Keyhole period, a linearly
coded RAM being selected when outside the Keyhole.
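
The frame-rate sequence just described (count levels inside the Keyhole, derive a transfer function, load it into a table, and select that table only inside the Keyhole) can be sketched in Python as follows. This is a software illustration under stated assumptions, not the ACE implementation: the function names, the rectangular keyhole tuple and the midpoint quantisation rule are all hypothetical choices.

```python
def keyhole_histogram(frame, keyhole, levels=256):
    """Count pixels per input level inside the keyhole (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = keyhole
    hist = [0] * levels
    for row in frame[y0:y1]:
        for p in row[x0:x1]:
            hist[p] += 1
    return hist


def equalising_lut(hist, out_levels=32, out_max=255):
    """Transfer function approximating a flat output histogram over out_levels steps."""
    total = sum(hist)
    if total == 0:
        return list(range(len(hist)))  # nothing counted: fall back to a linear table
    lut, cum = [], 0
    for count in hist:
        # Quantise the cumulative fraction at the midpoint of this input level.
        step = min(out_levels - 1, (2 * cum + count) * out_levels // (2 * total))
        lut.append(step * out_max // (out_levels - 1))
        cum += count
    return lut


def process_frame(frame, keyhole, lut_inside, lut_outside):
    """Select lut_inside for pixels within the keyhole and lut_outside elsewhere."""
    x0, y0, x1, y1 = keyhole
    out = []
    for y, row in enumerate(frame):
        out.append([
            (lut_inside if (x0 <= x < x1 and y0 <= y < y1) else lut_outside)[p]
            for x, p in enumerate(row)
        ])
    return out
```

Because the table is derived directly from the 256-bin count in a single pass, it can be recomputed every frame, so the displayed Keyhole follows changes in the scene.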


It  is  now  clear  why  we  state  that  edge  enhancement  is  performed  in  parallel  with  histogram 
modification,  since  the  former  algorithm  requires  access  to  the  point  difference  signal, 
while  the  latter  requires  the  brightness  (i.e.  mean)  signal.  In  the  ACE  system  these 
signals  are  indeed  separate  and  are  modified  in  parallel  before  recombining. 
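
A minimal software sketch of this separation is given below. It assumes the brightness signal is the 3x3 mean and the point signal is the centre pixel minus that mean, as described above; the edge-replication rule at the frame border, the single `edge_gain` parameter and the function names are assumptions, and the directional (horizontal, vertical, diagonal) edge paths are omitted.

```python
def mean_and_point(frame, x, y):
    """Return (3x3 mean, centre pixel minus that mean), replicating pixels at the border."""
    h, w = len(frame), len(frame[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(h - 1, max(0, y + dy))
            xx = min(w - 1, max(0, x + dx))
            total += frame[yy][xx]
    mean = total / 9.0
    return mean, frame[y][x] - mean


def enhance(frame, brightness_lut, edge_gain=1.0, top=255):
    """Modify the brightness signal by table, scale the point signal, then recombine."""
    out = []
    for y in range(len(frame)):
        row = []
        for x in range(len(frame[0])):
            mean, point = mean_and_point(frame, x, y)
            value = brightness_lut[int(round(mean))] + edge_gain * point
            row.append(int(min(top, max(0, round(value)))))
        out.append(row)
    return out
```

With a linear table and `edge_gain=1.0` the input is reproduced unchanged; raising the gain emphasises fine detail, and with it any noise.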

4. EXPERIMENTAL RESULTS

The  pictures  in  this  article  have  been  derived  from  TV  cameras  or  slide  scanners  viewing 
either  a  diorama  or  transparencies.  In  the  case  of  the  fish  head  (a  halibut)  an  x-ray 
transparency  was  used.  This  x-ray  was  chosen  for  its  wide  range  of  grey-scales  and  fine 
detail  which  serve  to  illustrate  the  points  we  wish  to  emphasise. 

Some  idea  of  the  subjective  effect  of  grey-level  histogram  equalisation  may  be  obtained 
from  the  rather  old  picture  of  a  256  step  grey-level  wedge  shown  in  Figure  7  (Guildford, 
L.H.,  1978/79).  The  upper  histogram  shows  that  within  the  Keyhole,  the  scene  intensities 
span  only  one  quarter  of  the  total  available  brightness  range.  The  lower  histogram  shows 
that  after  processing,  the  pixels  within  the  Keyhole  now  span  the  entire  intensity  range 
from  black  to  white,  a  contrast  expansion  of  4:1  in  this  instance.  ACE  implements  a 
'direct'  histogram  modification  algorithm  in  that  a  count  is  made  over  all  256  grey-levels 
of the input signal instead of only the 32 output levels. This enables the microprocessor
to calculate the optimal transfer function immediately without recourse to any iterative
techniques. The modified image thus tracks any changes of the input scene on a frame to
frame basis.

To  illustrate  the  subjective  effects  of  this  form  of  processing  we  have  chosen  to  use  our 
'fish  head'  high  resolution  x-ray  slide.  Figures  8  and  9  bring  together  a  group  of 
pictures  each  illustrating  various  aspects  of  the  algorithm.  The  histogram  modification 
seen  in  Figure  8  has  in  general  produced  a  scene  adaptive  contrast  stretch,  within  the 
programmed  32  output  grey-levels  to  the  display  and  has  as  a  result  brought  out  details 
that were hidden due to lack of local contrast. These effects may also be seen in
Figures 1, 6 and 11. In the latter figures the transfer function reflects these changes
and shows that the video transfer characteristic is a monotonically increasing
indeterminate function that is scene dependent and not the more usual log, exponential or
linear function associated with most video systems.

As  the  processing  operator  relies  upon  statistical  information  for  its  operation,  loss  of 
detail  can  result  from  adjacent  differing  pixels  being  forced  into  the  same  displayed 
grey-level.  This  may  be  detected  in  Figure  8B  and  is  most  certainly  evident  in  the  large 
area histogram modifications of Figure 10A. By adding 2-dimensional edge enhancement
(the 'contour operator' in Figure 6) to the histogram-modified video signals, small
differences in contrast which exist at the higher spatial frequencies are highlighted, as
shown in Figures 8C and 10B. However, care must be taken, since one can run into noise
problems if excessive edge emphasis is used.

Images  which  have  been  modified  so  as  to  give  a  non-rectangular  histogram,  such  as  rising 
or  falling  power  law  or  a  Gaussian  distribution,  are  also  of  interest  and  could  be  of  use 
in  providing  additional  emphasis  in  a  predetermined  range  of  grey-levels.  Figures  9A  and  B 
demonstrate  the  effect  of  rising  and  falling  power  law  distributions  on  our  standard 
picture. Emphasis of the lower and upper grey-level distributions may be observed in these
scenes.
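
One way to obtain such non-rectangular output histograms is to match the cumulative input count against the cumulative form of the desired distribution. The Python sketch below shows this for rising and falling power-law targets; the function names, the exponent value and the midpoint matching rule are assumptions made for illustration and are not taken from the ACE software.

```python
def power_law_target(out_levels=32, exponent=2.0, rising=True):
    """Desired output histogram weights following a rising or falling power law."""
    weights = [float((k + 1) ** exponent) for k in range(out_levels)]
    return weights if rising else weights[::-1]


def specification_lut(hist, target, out_max=255):
    """Transfer function whose output histogram approximates the target weights."""
    total = sum(hist)
    if total == 0:
        raise ValueError("empty input histogram")
    out_levels = len(target)
    t_total = sum(target)
    t_cdf, acc = [], 0.0
    for w in target:
        acc += w
        t_cdf.append(acc / t_total)
    lut, cum, step = [], 0, 0
    for count in hist:
        frac = (cum + count / 2.0) / total  # cumulative fraction at the bin midpoint
        cum += count
        # Advance to the first output step whose cumulative target reaches frac.
        while step < out_levels - 1 and t_cdf[step] < frac:
            step += 1
        lut.append(step * out_max // (out_levels - 1))
    return lut
```

A flat target (`[1.0] * 32`) reduces this to ordinary equalisation. A rising target places more pixels in the brighter output levels and a falling target more in the darker ones.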

Figure  11  demonstrates  the  use  of  our  histogram  processor  on  an  area  of  the  fish  head  which 
exhibited a bi-modal grey-level distribution (see Figure 11A). The output histogram
demonstrates that our operator is capable of dealing with this more complex situation. The
resultant transfer function, though monotonic, exhibits 4 areas of compression and 3 areas
of expansion, as the histogram is in fact near to being tri-modal. In this instance the
final processed information is presented to the display in 64 grey-levels (6 bits),
whereas all the previous output histogram modifications have expressed the displayed image
in 32 levels.


5.  CONCLUSIONS 

We  have  found  that  one  solution  to  the  display  dilemma  discussed  in  the  second  section  of 
this  paper  is  to  perform  real-time  histogram  modification  on  a  selected  section  of  the 
displayed  image,  probably  combined  with  limited  edge  enhancement.  This  operator  provides 
both  the  local  contrast  stretching  and  adaptive  compression  required  to  match  a  wide  range 
of  video  signals  to  the  display.  The  use  of  keyholing  techniques,  however,  limits  the 
application  of  this  form  of  processing  to  a  military,  medical,  or  scientific  environment 
rather  than  to  a  consumer  one;  but  it  does  have  the  advantage  of  allowing  much  greater 
adaptivity  within  the  selected  area,  thereby  bringing  the  most  out  of  a  scene. 

The quality of the final image and the effectiveness of both processing operators, i.e.
histogram manipulation and 2-dimensional edge enhancement, are ultimately limited by the
noise  in  the  video  waveform.  A  busy  edge  in  the  image  is  often  quite  acceptable  to  the 
viewer.  The  signal  to  noise  ratio  in  the  video  signal  from  the  scenes  shown  is  of  the 
order  of  46  dB. 

In  the  future  we  will  be  using  our  ACE  processor  to  investigate  the  use  of  other  non-linear 
grey-level  histogram  distributions  whose  selection  could  be  made  scene  dependent.  The  use 
of a variable Keyhole enhancement area permits us to carry out detailed analysis of
sub-frame areas, down to a single pixel if necessary. Such investigations are of interest when
considering  the  application  of  grey-level  modification  on  a  pixel  by  pixel  basis  over  the 
whole  frame. 

My colleague Dr P K Bailey has been a major contributor to this program of work and I
would  like  to  acknowledge  his  able  support. 
Fig. 1 High contrast scene demonstrating loss of detail in a local area of low contrast
Fig. 2 Detectable noise limited signal levels for TV and IR sensors
Fig. 3 Just perceptible relative intensity v. luminance, for various object
subtenses - 90% confidence rating
Fig. 4 Discernible grey-levels v. subtense of object
SUMMARY

The performance in target acquisition or real-time reconnaissance can be improved
by a team of observers. Different possible team organizations and their
characteristics are described. In experiments with forward-looking TV films
shot from a low-flying aircraft, different pseudo-team algorithms with one
to four operators are considered. Acquisition performance is measured
against different criteria, and a cost function is evaluated which weights the
success and confidence of the acquisition and the time expended.
1. INTRODUCTION

For air attack RPVs and, under special conditions, also for reconnaissance
RPVs, the typical mission profile could be a fast terrain-following flight.
For both tasks the RPV would carry forward-looking image sensors. Because
on-line decisions must be made in real time, continuous transfer of image
data to a ground station via a telemetry data link would be necessary.

The information will be displayed to one or more observers on monitors.
Their task includes the detection, classification and identification of
targets. This task has to be completed in a short time, since normally the
targets have to be acquired (and possibly attacked) during a single
approach. The difficulty of acquisition is inherent in the following
boundary conditions:

-  Narrow  field  of  view  resulting  from  image  sensor 

-  Low  image  quality  due  to  small  data-link  bandwidth

-  Fast  moving  scene 

-  Multiple  targets 

-  Camouflaged  targets 

As shown in /6/, a single observer cannot cope with these conditions and
achieve an acceptable acquisition rate. One approach to increasing
acquisition performance is to use a well-organized team of operators, which
can overcome the problems of a small time budget and high required
acquisition rates with a minimum of omissions, false alarms and wrong
classifications or identifications. The definition of an optimized team structure
has to be based on a survey of existing literature and task-adequate
experiments. The experiments performed here should answer the question of how
many team members, in which decision-making organization, give the best
ratio of success to expense.
2. PRINCIPLES OF TEAM ORGANIZATION

Team organizations can be divided into non-redundant (every operator
observes a different part of the image data) and redundant (all operators
observe the same image data), as shown in Figure 1. Non-redundant
teams can share their work by content or by sequence. In a
content-sharing structure, increasing the number of operators increases
performance by the same factor /2/. This has not yet been proved for the
division into job sequences.

For target acquisition by RPV with a single sensor, the first approach to
an optimized team structure should be a redundant organization, to ensure
a minimum of missed and falsely classified targets. Redundant team
organizations subdivide into real teams and pseudo-teams, distinguished
by the kind of communication between team members.


2.1  REAL  TEAM  WITH  REDUNDANT  STRUCTURE 

A real team of operators cooperates directly to solve the observation task;
a pseudo-team organization combines the results of single observers by an
algorithm, without intercommunication.

A real team with non-hierarchical structure always performs better than
single observers, because errors are eliminated by other team members
/3, 4, 5, 6, 7/. Performance increases with growing team size; the
greatest increase is achieved by adding a second observer to a single one.
Further enlargement of the team leads asymptotically to a certain level of
performance /2, 8/. A team of the described structure significantly
improves acquisition, especially when task difficulty /9/ and task
frequency /4/ are high.

The real team with hierarchical structure can evolve or can be defined
formally. The genuine team evolves owing to the differing dominance of
team members. Maximum performance of a hierarchical team presumes
behaviour of single members between authoritarian and democratic. A total
lack of dominance, as well as a significantly dominant leader, would decrease
performance. Superior performance can be expected if all team members
are active and their cooperation is organized by a coordinator /10/.

A  formally  defined  hierarchy  requires  that  the  most  experienced  operator 
acts as team leader /11/.


2.2  PSEUDO-TEAM  WITH  REDUNDANT  STRUCTURE 

This kind of team solves the acquisition task without interpersonal
communication. An algorithm combines the results of the single team members into
a final decision. The algorithm has to be chosen with regard to the desired
performance criteria of the team.

If the observation task leads to a yes-no decision, two classes of algorithm
can be distinguished. In a serial algorithm the team decision is positive if
all single results are positive. A parallel algorithm produces a positive
decision of the pseudo-team if at least one single decision is positive (Fig. 2).

The serial algorithm minimizes the false-alarm rate; the parallel algorithm
maximizes the detection rate /12/. Mixing the two algorithms leads
to a partly serial algorithm, in which a defined number of positive single
decisions is necessary to produce a positive decision of the pseudo-team.
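
The three decision rules can be viewed as one "at least k of n" rule, with k = n for the serial case and k = 1 for the parallel case. The Python sketch below illustrates this reading; the function names are hypothetical and the code is not the evaluation software used in the experiments.

```python
def team_decision(votes, required):
    """Positive team decision when at least `required` single decisions are positive."""
    if not 1 <= required <= len(votes):
        raise ValueError("required must lie between 1 and the team size")
    return sum(1 for v in votes if v) >= required


def serial_decision(votes):
    """Serial algorithm: every observer must respond positively."""
    return team_decision(votes, len(votes))


def parallel_decision(votes):
    """Parallel algorithm: one positive response is sufficient."""
    return team_decision(votes, 1)
```

For a four-operator team, `team_decision(votes, 2)` would be one possible partly serial rule.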

The serial as well as the parallel types of algorithm increase detection
rate asymptotically with increasing number of team members /13/.
The final level of detection rate is always lower in the serial case than
in the parallel /8, 14/.

The false-alarm rate increases linearly with team size using a parallel
algorithm /13/, whereas the serial one minimizes this kind of error, because
independent observers show better conformity in detecting targets
than in detecting confusing objects /12/.
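
The behaviour of the false-alarm rate can be illustrated with an idealised model in which every observer raises a false alarm on a given confusing object independently and with the same probability f. This independence is an assumption made here for illustration only; real observers are correlated, so the figures are indicative and the function name is hypothetical.

```python
from math import comb


def team_rate(rate, n, required):
    """Probability that at least `required` of n independent observers respond,
    each responding with probability `rate`."""
    return sum(
        comb(n, k) * rate ** k * (1 - rate) ** (n - k)
        for k in range(required, n + 1)
    )


def parallel_rate(rate, n):
    """Parallel rule: at least one of n observers responds."""
    return team_rate(rate, n, 1)


def serial_rate(rate, n):
    """Serial rule: all n observers respond."""
    return team_rate(rate, n, n)
```

Under this model the parallel false-alarm rate is 1 - (1 - f)^n, which is close to n f when f is small: for f = 0.05 and four observers it is about 0.19 against 4 f = 0.20. The serial false-alarm rate is f^n, which falls rapidly with team size.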

It is also possible to weight the single observers' decisions in the
pseudo-team algorithm. This is necessary if a confidence or probability has to
be assigned. No influence on team performance was found /15/ with
the weightings:

-  unique 

-  not  unique,  assigned  by  self-rating 

-  not  unique,  resulting  from  past  performance 

Real  teams  and  pseudo-teams  have  some  mixed  forms  with  different  feedbacks. 

- Information about the decision of other team members increases
  performance of a serial pseudo-team /16/.

- The results of a pseudo-team can be fed back to the single observers and
  discussed in a real team (team consensus feedback), or they can be
  displayed once /17/ or several times /15/ to the pseudo-team (consensus
  feedback). The team consensus feedback reduces the false-alarm rate

mi. 

-  Results  of  a  pseudo-team  can  be  observed  by  the  single  operators  without 
communication  /9/. 

A rough comparison of team organizations gives the following ranks with regard
to detection rate and speed /6, 9, 12/:

1)  parallel  pseudo-team 

2)  real  team 

3)  single  observer 

4)  serial  pseudo-team 

The  ranks  in  minimizing  false-alarm-rate  are 

1)  serial  pseudo-team

2)  real  team 

3)  single  observer 

4)  parallel  pseudo-team 

This survey shows that a task as complex as the one under consideration here has
not been investigated in the literature, but basic decisions can be made on how
to choose a team. The choice has to be made between pseudo-teams and single
observers, because real teams can be ruled out under these real-time conditions.
In the two rank scales a pseudo-team gives the best result in both cases,
but an optimum has to be found between the serial and the parallel type
according to the defined aims in target acquisition.


3.  EXPERIMENTS

The reported experiments and results are a selection from /18/.


3.1  EXPERIMENTAL  SET-UP 

Figure 3 shows the experimental set-up. In a simulation cabin of 2,5 m x
5 m four operators sit in front of TV monitors with screen diagonals of 23 cm
and 31 cm respectively. They watch a TV film replayed from a tape recorder; it
contains fifteen film scenes of 45 - 60 sec duration with a pause picture of
3 sec in between. The films were taken during an experimental Remote Target
Acquisition Program (KEL/TZE). They were made with a 625-line camera with a
zoomable field of view between 4,5° and 45° and a variable pitch between 3 and
15 degrees.

All operator panels are equipped with light pens to mark the x/y-coordinates
of objects like cars, trucks and tanks, and with four push-buttons for the
classification, to be operated by the left hand. The operator places were
separated by thin walls. All data were fed into the computer for
on-line and off-line evaluation.


3.2  EXPERIMENT  ORGANISATION 

The experiments were developed during pretests with four operators. In the
preliminary trials it became obvious that identification was not possible
because of the image quality. Therefore the assumption was made that all
targets were "enemy" targets, as implied by the tactical situation. The task
of the operators was

-  to  detect  and  mark  coordinates  by  light  pen 

-  to  classify  with  push-buttons 

as quickly and correctly as possible whenever they saw a


-  truck  or 

-  tank 

The main tests were made with two groups of four operators, who were
employees of the company with no experience in target acquisition on
monitors.

The duration of a test run was 15 minutes. Before each trial written
instructions were given to the operators. The handling of the light pen
and the push-buttons was also practised before each test run. One to two runs
per day were made.

All film scenes together contained a total of seventeen targets. Some scenes
had more than one target, others none. The performance in each test run was
evaluated, and after the ninth run the operators were regarded as trained. The
test series was continued up to thirteen runs, so that four runs were available
for the final evaluation.

It was a deficiency of the experiments that only one film was available,
so that the learning behaviour of the operators is not representative of
real situations.


3.3  TV-MATERIAL  USED 

In /1/ scene parameters such as context, target size, contrast etc. are
defined. Table 1 shows the scene parameters considered here. For the
parameters not considered, either no data were available or the data could not
be used statistically.

Table  2  shows  the  evaluated  parameters  of  all  targets.  The  criterion  value 
is  the  mean  over  eight  operators  and  four  trials  for  each  single  target. 

Figure 5 shows the histograms of target and background brightness and
contrast, and the mean and ± 1σ values for the three target categories.

Table 3 gives detailed information about the scene parameters and their
influence on the acquisition measures and the criterion function value. It is
interesting that there is no significant influence of target motion on
acquisition performance of the kind demonstrated for acquisition with a
non-moving camera. Here it has to be recognized that the scenes themselves are
dynamic and that there were only three fast-moving targets.

It can be seen that the target-background contrast has a significant
influence on acquisition performance.

A comparison between a subjective classification of the difficulty of
detecting and classifying particular targets and the defined criterion
function J is also interesting. The value of J correlates very well
with the subjective difficulty, see table 3.

Table 4 gives a qualitative weighting of the influence of the scene parameters
on acquisition performance.

Acquisition performance improves with target brightness and target-background
contrast, and for civil cars and non-moving targets; it decreases for
camouflaged targets such as tanks.


3.4  DEFINITION  OF  MEASURED  VALUES 

The  following  definitions  were  used  to  describe  the  target  characteristics, 
the  single  operator  performance  and  the  team  performance: 

-  Display time $T_Z$:

Period in which the target is detectable, starting with the earliest
time at which the target could be detected and ending with its disappearance
from the screen.

-  Contrast $K$:

$$K = \frac{L_1 - L_2}{L_1} \cdot 100\,\%$$

with $L_1$:  Brightness of the brighter object
$L_2$:  Brightness of the darker object

-  Detection time $T_E$:

Period needed to detect a target correctly and to set its coordinates by light
pen, starting at the time from which the target can be detected.

-  Classification time $T_K$:

Period between the correct detection and the correct classification of a
target (car, truck, tank). The coordinates are taken as correct if they
correspond to the a priori known target position.

-  Total time $T_G$:  Period to detect and classify a target correctly

$$T_G = T_E + T_K$$

The following performance indicators are expressed in percent of the number
of displayed targets (Fig. 4 gives a systematic overview):

Correct Detections  $N_{RE}$

Missed Detections (Omissions)  $N_{NE}$

False Alarms  $N_{FE}$

with

$$N_{NE} + N_{RE} = 100\,\%$$

$$0 \le N_{FE} < \infty$$

Correct Classification  $N_{RK}$

Missed Classification  $N_{NK}$

False Classification  $N_{FK}$

with

$$N_{NK} + N_{FK} + N_{RK} = 100\,\%$$

The term decision was used in relation to the pseudo-team performance, with
the assumption that a correct decision of the team presupposes a correct
detection and classification by the team members involved.

To quantify the target acquisition performance of one operator or of different
teams of operators, the performance indices "Success", represented by the
percentage of correct classifications, "Confidence", represented by the
percentage of false classifications, and "Speed", represented by the total time
needed in relation to the mean time for which targets are displayed, are
combined in a performance criterion J.


The  criterion  J  is  defined  as 


$$J = a\,\bar{N}_{RK} - b\,\bar{N}_{FK} + c\,(\bar{T}_Z - \bar{T}_G)$$

This empirical criterion function is suited to real-time tasks and can be
used for the comparison of different acquisition teams. The bars indicate
mean values.

It  can  also  be  used  to  look  at  the  influence  of  different  scene-parameters 
on  the  acquisition  performance. 
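As an illustration, the criterion can be evaluated as in the sketch below. It reads the criterion as J = a·mean(N_RK) − b·mean(N_FK) + c·(mean(T_Z) − mean(T_G)), which is an assumption about the printed formula; the function name `criterion_j` and all sample values are hypothetical.

```python
from statistics import mean


def criterion_j(n_rk, n_fk, t_z, t_g, a=1.0, b=1.0, c=30.0):
    """Performance criterion J (assumed form, see lead-in).

    n_rk, n_fk: percent correct / false classifications, one value per run
    t_z, t_g:   display time and total time in seconds, one value per run
    a, b, c:    weights for success, confidence and speed
    """
    return a * mean(n_rk) - b * mean(n_fk) + c * (mean(t_z) - mean(t_g))


# Hypothetical values for two test runs of one operator.
j = criterion_j(n_rk=[60, 70], n_fk=[10, 6], t_z=[2.9, 2.9], t_g=[2.4, 2.6])
print(round(j, 1))
```

Changing a, b and c in such a function is one way to explore how sensitive a ranking of operators or teams is to the weighting of success, confidence and speed.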


4.  RESULTS


4.1  SINGLE  OPERATOR  PERFORMANCE 

To obtain a quasi-stationary operator behaviour, it was necessary to run the
tests thirteen times. Several performance criteria were calculated for each
operator. As an example the results for operator number four are
shown in Fig. 6 to 8.

It can be seen that some of the performance measures give redundant
information.

Looking  at  the  behaviour  of  all  operators  the  following  observations  were 
made: 

-  The total time $T_G$ and its variance show only a very slight tendency to
decrease.

-  The classification time is nearly constant, which means that it reflects the
time needed to handle the equipment rather than a decision time. It is thought
that the decision is made during the detection process.

-  The detection time is responsible for the shape of the total time $T_G$.

-  There is only a small difference between correct detections and correct
classifications, which means that the classification decision was included in
the detection decision.

-  Wrong classifications are nearly constant over the test runs.

-  False detections vary very much from test to test. This reflects the
activation level and the risk behaviour of the operators.

-  There is a tendency for operators with many false detections also to
have many correct detections.


4.2  CRITERION  FUNCTION 

Fig. 9 to 11 show the mean values of the last four test runs of every
operator for all performance measures and the mean value of the group.

Fig. 12  shows  different  criterion  function  values  for  every  operator  of 
group  one  and  two.  A  variation  of  the  coefficients  should  give  a  feeling 
for  the  sensitivity  of  the  criterion  concerning  the  weighting  of  success, 
confidence  and  speed. 

Which coefficients (a, b, c) have to be taken for the three
performance factors depends on the aims and criteria of the task.

For the comparison of team algorithms a uniform weighting was chosen with
(a = 1, b = 1, c = 30).

The rank of the operators in relation to their criterion value (best,
second, third performance) depends on the weighting coefficients used.

The result for individual parallel work of all operators is J = 58,5 with
σ = 14,1. This is the basis for the discussion of performance improvements
by pseudo-teams.

The mean target display time was 2,9 sec with σ = 1,5. In /1/ it was found
that a display time of 2 sec resulted in a heavy operator load. This is
confirmed by the tests.


4.3  TEAM  DECISION  "N  OUT  OF  FOUR" 

Four different algorithms are possible. One out of four means: The quickest
classification is taken as the decision of the team. The time needed
(total time) is that of this operator (parallel pseudo-team).

Two out of four means: The first two equal classifications are taken as the
decision of the team (mixed parallel-serial pseudo-team). The time needed
is the time of the later of the two operators.

Three out of four is analogous to two out of four.

Four out of four means: All operators have classified the target identically.
The time needed is that of the slowest operator (serial pseudo-team).
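The four decision rules can be sketched as one function. The name `team_decision`, the (time, label) response format and the sample responses are hypothetical and serve only to illustrate the rules as described above; this is not the evaluation program used in the experiments.

```python
from collections import defaultdict


def team_decision(responses, n):
    """Return (label, time) of the first classification that reaches n equal
    votes, or None if no classification reaches n votes.

    responses: list of (time_in_seconds, label), one entry per operator
               who classified the target
    n:         required number of identical classifications (1 to 4)
    """
    votes = defaultdict(int)
    for time_s, label in sorted(responses):   # process in order of time
        votes[label] += 1
        if votes[label] >= n:
            return label, time_s              # time of the n-th agreeing operator
    return None


# Hypothetical responses of four operators to one target.
responses = [(2.1, "tank"), (1.4, "truck"), (3.0, "tank"), (2.6, "tank")]
for n in range(1, 5):
    print(n, team_decision(responses, n))
```

With these sample responses, one out of four follows the quickest (here wrong) classification, two and three out of four agree on the majority label at a later time, and four out of four yields no decision.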

Fig. 13 shows the results for both groups of operators. One out of four
performed about 25 % better than the best single operator.

The more operators are involved, the slower the decision becomes: the
total time goes up (40 %), the number of no-decisions goes up (43 %), and the
number of correct decisions goes down. The positive effect is that the
number of wrong decisions goes down (11 %).

Fig. 14 shows the sensitivity of the criterion function for the four
pseudo-teams. It can be seen that the algorithm one out of four gives the best
performance if number of targets (success), number of false alarms
(confidence) and time needed (speed) are equally weighted.


4.4  TEAM DECISION "ONE OUT OF M"

The  evaluation  of  this  algorithm  should  show  the  "cost  effectiveness"  of 
the  number  of  operators. 

One problem was which operators of a group should be taken for one out of
one, one out of two and one out of three. It was decided to take neither the
best nor the worst.

Fig. 15 shows the results for the second team. As the number of operators
increases, the time needed (total time) goes down, correct decisions increase
and the number of no-decisions decreases, but wrong decisions increase.

The sensitivity of the results as a function of the weighting factors for
success, confidence and speed is shown in Fig. 16.

It can be stated that one out of four is 25 % and one out of three 14 %
better than the best single operator.

The optimum of the criterion function lies between three and four operators,
depending on their skill and the weighting factors, i.e. the aims
of target acquisition.


CONCLUSION 

The  optimal  size  of  a  pseudo-team  is  three  to  four. 

Assuming an equal weighting of success, confidence and
speed, the best performance is obtained when the first decision of any one of
the operators is taken (parallel pseudo-team).

The investigated non-hierarchical team structure has a disadvantage,
because each operator has to perform two tasks: detecting targets by searching
the possible area of appearance, and tracking targets until a classification
is possible. If targets appear in short succession, it can
happen that some are overlooked. This can be improved by a hierarchical
team.

It is assumed that the acquisition performance can be improved with image
enhancement techniques such as gray scale manipulations.

In  the  future  it  should  also  be  investigated  how  a  priori  information 
(like  high  altitude  reconnaissance)  will  improve  the  operator  performance. 

Additionally  the  influence  of  an  automatic  detection  device  on  the  total 
acquisition  performance  should  be  investigated. 
List of possible scene parameters    Parameters considered
---------------------------------    ---------------------------
Target size                          resulting from target shape
Number of targets per TV-scene       1-6 (identical targets)
Shape of target                      car, truck, tank
Target brightness                    yes
Target color                         -
Number of target categories          3
Target grouping                      -
Target motion                        yes
Background brightness                yes
Target-background contrast           yes
Context of background                yes
Target density                       -
Target position, traces              -
Environmental parameters

Table 1:  Scene Parameters


No.  Class  Context  Movement  Difficulty  L_Z   L_H   K   T_Z (s)  T_G (s)  Z_RK (%)  Z_FK (%)  J
---  -----  -------  --------  ----------  ----  ----  --  -------  -------  --------  --------  ----
 1   CAR    Forest   fast      medium      6     0,7   89  2,3      2,3      3,'       0         38
 2   TANK   Heather  no        difficult   6,8   8,6   35  4,5      3,8      78        47        -25
 3   CAR    Trees    no        easy        9,4   4,5   52  3,0      2,4      100       3         94
 4   TRUCK  Street   slow      easy        10,2  2,7   73  2,2      1,6      100       0         135
 5   CAR    Trees    no        difficult   11,8  3,0   75  0,8      1,7      47        0         46
 6   CAR    Meadow   no        medium      15,6  8,3   47  2,9      2,8      34        0         19
 7   TRUCK  Trees    no        medium      6,4   4,4   31  1,5      2,0      100       0         113
 8   TANK   Trees    fast      easy        18,8  7,2   62  2,8      2,2      100       3         121
 9   TANK   Heather  no        difficult   1,6   1,9   18  7,3      5,2      88        22        -43
10   TRUCK  Trees    slow      difficult   4,4   8,1   46  2,0      2,7      6         3         3
11   TRUCK  Trees    slow      medium      6,8   2,2   75  4,6      5,3      97        0         1 3
12   TRUCK  Trees    fast      difficult   16,7  11,1  34  4,0      2,6      34        59        -23
13   TANK   Heather  no        easy        5,2   11,0  53  3,1      1,2      97        3         111
14   TANK   Heather  no        easy        13,3  1,7   79  2,7      2,1      88        6         81
15   TRUCK  Heather  no        medium      6,7   0,7   63  2,5      2,0      59        6         72
16   TRUCK  Heather  slow      difficult   1,6   5,1   29  1,5      0,9      9         9         49
17   TRUCK  Heather  no        difficult   0,9   3,2   93  1,6      2,1      22        3         50

J evaluated with a = 1, b = 1, c = 30

Table 2:  Characteristic data and performance
measures of all targets


Fig. 1  Types of team organisations


Fig. 2  Pseudo-team algorithms: serial decision algorithm (4 out of 4),
parallel decision algorithm (1 out of 4), mixed serial-parallel decision
algorithm ("2 out of 4")
TIME MEASURES

*  Time target is displayed: $T_Z$

*  Total time $T_G$: sum of $T_E$ and $T_K$

*  Classification time $T_K$ for correctly classified targets (difference
between $T_E$ and actual time)

*  Detection time $T_E$ for correctly detected targets (difference between
$T_0$ and actual time)


Fig. 4  Sequence and data collection in target acquisition


Fig. 9  Mean time measures of all operators averaged over the trials 10 - 13
(operators 1 - 4: group 1, operators 5 - 8: group 2)


Fig. 10  Mean detection performance of all operators averaged over the
trials 10 - 13


Fig. 11  Mean classification performance of all operators averaged over the
trials 10 - 13


Fig. 12  Sensitivity of cost function to the value of weighting coefficients
for single operators


Fig. 16  Sensitivity of cost function to the value of weighting coefficients
for "1 out of M" with group 2
ABSTRACT

A two-dimensional data-compression method is described which is based on a least-squares image approximation
using splines. A complete analysis of this process in the frequency domain is given. For the data-compression
process the feasibility of some candidate realisations of the real-time operating hardware is
described. The image reconstruction from the compressed data set consists of an off-line computation of a modified
compressed data set followed by a real-time replay interpolation process.

In  order  to  achieve  a  better  image  reconstruction  a  flexible  edge  enhancement  algorithm  was  developed. 


1.  INTRODUCTION


Limiting the amount of sensor data to be transmitted and/or processed may have a considerable impact on the total
end-to-end sensor system design. Especially with imaging sensors, data reduction techniques may be
beneficially applied, owing to the redundancy in the images.

At the National Aerospace Laboratory, data reduction techniques based on spline approximation are being
investigated for application to:

-  remote  sensing  systems  (LANDSAT) 

-  sensor systems (IR, SLAR)

-  airborne  data  acquisition  systems. 

Spline approximation techniques achieve data reduction by entropy reduction in the spatial domain. Attention
was drawn to these techniques because of the attractive local character of the operations to be performed,
and further because of the optimal interpolation property of splines.


Fig. 1.1  Application of data reduction and reconstruction in an airborne or spaceborne sensor system


The results so far are so promising that application is planned in the data processing of the Dutch Infrared
Astronomical Satellite IRAS.

These techniques are expected to be very useful as well for reduction of data to be processed for target
acquisition and recognition. In this case one may think of reducing the data from a sensor system onboard the
aircraft, in order to reduce the bandwidth needed for transmission or the capacity of a storage medium. In the
sequel a data reduction method is described which is based on least squares approximation of the image with a
two-dimensional spline function. This spline function is completely defined by a limited number of characteristic
parameters. This number is less than the original number of pixels, so data reduction can be achieved by
calculating the characteristic parameters of the approximating spline from the original pixel values. The
image can be reconstructed from this compressed data set.

The data-reduction process associated with the application of the spline approximation is a simple convolution
filter process, yielding the compressed data set. Image reconstruction is achieved by first preprocessing
the compressed data, which yields a modified compressed data set, and thereafter interpolating these data on a
(variable) grid. The preprocessing is a complex deconvolution process. The interpolation process, however, is
simple again, because only a finite number of values from the modified compressed data set (depending on the
order of the spline) are needed for the computation of the intermediate points. A complete analysis of this
data reduction-reconstruction process in the frequency domain is given in chapter 3 (Renes, J., '79).

Data reduction with this method results in a moderate decrease of spatial resolution. In the case of an airborne
scanner system this is often acceptable because of strong correlations between adjacent pixels (oversampling
in line direction and overlap of scan lines), resulting in a resolution sufficient for target acquisition
and recognition. If the decreased spatial resolution is not acceptable some improvement can be obtained
by means of an edge enhancement technique. Further, resolution differences in flight and cross-track direction
can be accounted for by using different reduction factors in the two directions.

In view of the airborne application and the data rates involved a hardware implementation of the compression
scheme is of interest. In chapter 4 a detailed treatment is given of some candidate hardware implementations
for the two-dimensional data-reduction process. Also some remarks on the applicability of the proposed
data reduction-reconstruction process are given. In chapter 5 attention is given to the theoretical background
of an edge enhancement algorithm. Finally, results obtained with Landsat data and some concluding remarks are
given in chapters 6 and 7.


2.  BASIC  CONCEPTS 

2.1  Splines  and  B-splines  defined 

A spline is defined as a function, S(t) say, that results from piecing together polynomial arcs of order
k (degree k-1) as follows:

-  each arc is defined over a unit length interval; the interval end points are called the knots,

-  at each knot the arcs match in value (except for k=1); the derivatives up to order k-2 then also match
in value,

-  for even (odd) k the knots are at (half-)integer values of t.

The B-spline (B stands for Basic) $B_k(t)$ of order k is defined recursively by convolutions as

$$B_1(t) = 1 \text{ when } |t| < \tfrac{1}{2} \text{ and } 0 \text{ elsewhere}$$

$$B_k(t) = (B_{k-1} * B_1)(t) = \int B_{k-1}(s)\, B_1(t-s)\, ds = \int_{-1/2}^{1/2} B_{k-1}(t-s)\, ds \qquad (2.1)$$


Consequently $B_k(t)$ satisfies the following properties

(a) $B_k(t) = 0$ for $|t| \ge k/2$

(b) $B_k(t)$ is a spline of order k

(c) $B_k(t)$ is positive and has unit area

(d) $\sum_\nu B_k(t-\nu) = 1$, all t

(2.2)


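Explicit expressions for the first three B-splines are

$$B_1(t) = 1, \qquad |t| < \tfrac{1}{2}$$

$$B_2(t) = 1 - |t|, \qquad |t| < 1$$

$$B_3(t) = \begin{cases} \tfrac{3}{4} - t^2, & |t| \le \tfrac{1}{2} \\ \tfrac{1}{2}\left(\tfrac{3}{2} - |t|\right)^2, & \tfrac{1}{2} \le |t| \le \tfrac{3}{2} \end{cases}$$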
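The recursion (2.1) can be checked numerically against the closed forms. The sketch below is illustrative only: the function names and the midpoint-rule integration with n steps are choices made here, and the agreement is limited by the discretisation.

```python
def b_recursive(k: int, t: float, n: int = 100) -> float:
    """B_k(t) from the recursion B_k(t) = integral of B_{k-1}(t - s) ds over
    -1/2 <= s <= 1/2, evaluated with an n-point midpoint rule."""
    if k == 1:
        return 1.0 if abs(t) < 0.5 else 0.0
    h = 1.0 / n
    return h * sum(b_recursive(k - 1, t - (-0.5 + (i + 0.5) * h), n)
                   for i in range(n))


def b2(t: float) -> float:
    """Closed form of the linear B-spline."""
    return max(0.0, 1.0 - abs(t))


def b3(t: float) -> float:
    """Closed form of the quadratic B-spline."""
    u = abs(t)
    if u <= 0.5:
        return 0.75 - t * t
    if u <= 1.5:
        return 0.5 * (1.5 - u) ** 2
    return 0.0


for t in (0.0, 0.3, 0.9, 1.2):
    print(t, round(b_recursive(3, t), 3), round(b3(t), 3))
```

The same closed forms can be used to confirm property (2.2-d): the shifted copies of B_3 sum to one at any t.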


The name B(asic)-spline stems from the following representation property:

Any spline S(t) can be represented by the B-spline series

$$S(t) = \sum_\nu C_\nu\, B_k(t-\nu) \qquad (2.3)$$

If S(t) is defined over n intervals the number of coefficients $C_\nu$ is n+k-1.

Note that due to (2.2-a) only k consecutive coefficients enter in the evaluation of the series (2.3); e.g. for
k=3 one has for $|\varepsilon| \le \tfrac{1}{2}$

$$S(\nu+\varepsilon) = C_{\nu-1}\,\tfrac{1}{2}\left(\tfrac{1}{2}-\varepsilon\right)^2 + C_\nu\left(\tfrac{3}{4}-\varepsilon^2\right) + C_{\nu+1}\,\tfrac{1}{2}\left(\tfrac{1}{2}+\varepsilon\right)^2$$

Hence the B-spline coefficients are a sequential and local representation of the spline S(t).
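The local character of the series can be made concrete for k=3. The sketch below compares the full series (2.3) with an evaluation that uses only the three neighbouring coefficients; the weights follow from inserting t = ν+ε into the series with the quadratic B-spline, so that the coefficient on the side towards which ε points carries the larger weight. Function names and coefficient values are hypothetical.

```python
def b3(t: float) -> float:
    """Quadratic B-spline (order k = 3)."""
    u = abs(t)
    if u <= 0.5:
        return 0.75 - t * t
    if u <= 1.5:
        return 0.5 * (1.5 - u) ** 2
    return 0.0


def spline_series(coeffs: dict, t: float) -> float:
    """Full series S(t) = sum over nu of C_nu * B_3(t - nu)."""
    return sum(c * b3(t - nu) for nu, c in coeffs.items())


def spline_local(coeffs: dict, nu: int, eps: float) -> float:
    """S(nu + eps) for |eps| <= 1/2 from three consecutive coefficients."""
    c = lambda i: coeffs.get(i, 0.0)
    return (c(nu - 1) * 0.5 * (0.5 - eps) ** 2
            + c(nu) * (0.75 - eps * eps)
            + c(nu + 1) * 0.5 * (0.5 + eps) ** 2)


# Hypothetical coefficients C_0 .. C_3.
coeffs = {0: 1.0, 1: 4.0, 2: -2.0, 3: 0.5}
print(spline_series(coeffs, 1.25), spline_local(coeffs, 1, 0.25))
```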

The notion of B-splines can be generalised to two-dimensional functions using Cartesian products. Instead
of knots one now has a grid of lines t=ν and s=μ. The B-spline series then becomes

$$S(t,s) = \sum_\nu \sum_\mu D_{\nu\mu}\, B_k(t-\nu)\, B_k(s-\mu) \qquad (2.3a)$$

In this case only $k^2$ coefficients enter in the evaluation of the two-dimensional spline function. The function
$B_k(t)\, B_k(s)$ is displayed in figure A1.

2.2  Least  squares  approximation 

Data  reduction  can  be  achieved  by  approximating  a  given  function  X(t)  with  a  spline  S(t).  In  general  the 
function  X(t)  is  described  by  a  set  of  samples  while  the  spline  S(t)  is  described  by  fewer  spline  coefficients. 
There  are  many  ways  to  specify  a  computation  rule  for  the  B-spline  coefficients  to  make  the  resulting  spline 
S(t)  "resemble"  a  given  input  function  X(t).  In  least-squares  approximation  the  coefficients  are  determined  by 
requiring  the  total  power  in  the  error  curve  X(t)-S(t)  to  be  minimal. 


$$\min_{C}\; E_{tot}(C_\nu), \qquad E_{tot}(C_\nu) = \int \Big( X(t) - \sum_\nu C_\nu\, B_k(t-\nu) \Big)^2 dt$$


For convenience only the continuous case is treated; the discrete case leads to analogous formulations.
By scaling the time t it is no restriction to have the spline knots equally spaced with distance 1.

A system of linear equations ensues from setting all partial derivatives to zero:

$$\partial E_{tot} / \partial C_\nu = 0$$

The system, also called the normal equations, has the typical equation

$$\int X(t)\, B_k(t-\nu)\, dt = \sum_\mu \int B_k(t-\nu)\, B_k(t-\mu)\, dt \; C_\mu \qquad (2.4)$$

$$= \sum_\mu (B_k * B_k)(\nu-\mu)\, C_\mu \qquad (2.5)$$

$$= \sum_p B_{2k}(p)\, C_{\nu+p} \qquad (2.6)$$

In the last step the symmetry and auto-convolution properties of the B-splines are used.

Before solving the normal equations, the left-hand side, also called the convolutions, has to be computed from
the input data X(t). In appendix A, two methods are given for solving the normal equations. When the normal
equations (2.6) are solved, the spline S(t) can be computed from equation (2.3).
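A minimal one-dimensional sketch for k=2 may make the procedure concrete. For k=2 the normal equations have the band B_4(0) = 2/3, B_4(±1) = 1/6. The sketch computes the convolutions by a midpoint rule and solves the tridiagonal system with the Thomas algorithm; this solver is substituted here for illustration and is not one of the two methods of appendix A. All names and the test signal are hypothetical.

```python
def b2(t: float) -> float:
    """Linear B-spline B_2(t) = 1 - |t| for |t| < 1, else 0."""
    return max(0.0, 1.0 - abs(t))


def convolutions(x, nus, lo, hi, n=200):
    """P_nu = integral of x(t) * B_2(t - nu) dt over [lo, hi] (midpoint rule)."""
    h = 1.0 / n
    ts = [lo + (i + 0.5) * h for i in range(int(round((hi - lo) * n)))]
    return [h * sum(x(t) * b2(t - nu) for t in ts) for nu in nus]


def solve_normal_equations(p):
    """Solve (1/6) C[v-1] + (2/3) C[v] + (1/6) C[v+1] = P[v] by the Thomas
    algorithm, taking C as zero outside the index range."""
    off, n = 1.0 / 6.0, len(p)
    diag, rhs = [2.0 / 3.0] * n, list(p)
    for i in range(1, n):                 # forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    c = [0.0] * n
    c[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        c[i] = (rhs[i] - off * c[i + 1]) / diag[i]
    return c


# Hypothetical input: X(t) is itself a linear spline with known coefficients,
# so the least squares solution should reproduce them.
true_c = {0: 1.0, 1: 3.0, 2: 2.0}
x = lambda t: sum(c * b2(t - nu) for nu, c in true_c.items())
nus = list(range(-2, 5))
coeffs = solve_normal_equations(convolutions(x, nus, lo=-3, hi=6))
print([round(c, 3) for c in coeffs])
```

Because the input is already a spline of the same order, the recovered coefficients should agree with the chosen ones up to the integration error; for a general input the same steps give the least squares fit.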

The analogue to (2.6) for image approximation, with image function X(t,s), is

$$\iint X(t,s)\, B_k(t-\nu)\, B_k(s-\mu)\, dt\, ds = \sum_p \sum_\tau B_{2k}(p)\, B_{2k}(\tau)\, D_{\nu+p,\,\mu+\tau} \qquad (2.7)$$


Before solving equation (2.7) the left-hand part of this equation, denoted by $U_{\nu\mu}$, has to be computed from the
input data X(t,s). This computation can be done in two consecutive steps: first compute the convolution $P_\nu(s)$
in t-direction

$$P_\nu(s) = \int X(t,s)\, B_k(t-\nu)\, dt$$

next compute the final convolution from $P_\nu(s)$

$$U_{\nu\mu} = \int P_\nu(s)\, B_k(s-\mu)\, ds$$


By inspection of (2.6) and (2.7) it follows that (2.7) can be solved in two consecutive steps;
first compute $\tilde{D}_{\nu,\mu}$ from

$$\iint B_k(t-\nu)\, B_k(s-\mu)\, X(t,s)\, dt\, ds = \sum_p B_{2k}(p)\, \tilde{D}_{\nu+p,\,\mu}$$

and next compute $D_{\nu,\mu}$ from

$$\tilde{D}_{\nu,\mu} = \sum_\tau B_{2k}(\tau)\, D_{\nu,\,\mu+\tau}$$

This property is known as the decomposition theorem. When the normal equations (2.7) are solved, the spline
S(t,s) can be computed from equation (2.3a).
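The two-step solution can be sketched for k=2 on a small finite grid. The sketch assumes coefficients equal to zero outside the grid and again uses a Thomas solver in place of the methods of appendix A; the helper names and the sample array are hypothetical.

```python
OFF, MID = 1.0 / 6.0, 2.0 / 3.0   # B_4(+-1) and B_4(0), i.e. k = 2


def solve_band(p):
    """Solve OFF*c[i-1] + MID*c[i] + OFF*c[i+1] = p[i] (Thomas algorithm)."""
    n = len(p)
    diag, rhs = [MID] * n, list(p)
    for i in range(1, n):
        w = OFF / diag[i - 1]
        diag[i] -= w * OFF
        rhs[i] -= w * rhs[i - 1]
    c = [0.0] * n
    c[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        c[i] = (rhs[i] - OFF * c[i + 1]) / diag[i]
    return c


def apply_band(c):
    """Forward operation: sum over p of B_4(p) * c[i+p], zero outside the range."""
    n = len(c)
    return [MID * c[i]
            + OFF * (c[i - 1] if i > 0 else 0.0)
            + OFF * (c[i + 1] if i < n - 1 else 0.0) for i in range(n)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def solve_separable(u):
    """Step 1: solve along the first index for every column.
    Step 2: solve along the second index for every row."""
    d_tilde = transpose([solve_band(col) for col in transpose(u)])
    return [solve_band(row) for row in d_tilde]


# Hypothetical coefficient array D; build the left-hand side and recover D.
d = [[1.0, 2.0, 0.0], [0.0, 3.0, 1.0], [4.0, 0.0, 2.0]]
u = transpose([apply_band(col) for col in transpose([apply_band(r) for r in d])])
recovered = solve_separable(u)
print([[round(v, 6) for v in row] for row in recovered])
```

The benefit of the decomposition is that only one-dimensional band systems have to be solved, once per column and once per row, instead of one large two-dimensional system.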

3.  FREQUENCY  ANALYSIS 

In this chapter an expression is found for the Fourier transform of the spline S(t) in terms of the Fourier
transform of X(t) for the least squares approximation.

Analogous expressions can be found for the two-dimensional case. For the frequency analysis we consider X(t) to
be defined on $-\infty < t < \infty$.

The Fourier Transform of a function X(t) is denoted by
$$FT(X) = x(f) = \int X(t)\, e^{-2\pi i f t}\, dt$$
and conversely

$$X(t) = \int x(f)\, e^{+2\pi i f t}\, df$$

Since

$$FT(B_1) = \sin(\pi f)/(\pi f) = \mathrm{sinc}(f),$$
one has from (2.1)

$$FT(B_k) = \mathrm{sinc}^k(f).$$

Sampling of a function X(t) (which will be denoted by X*(t)) at integer values amounts to multiplication with
the Dirac comb III(t) defined by the series

$$III(t) = \sum_\nu \delta(t-\nu),$$

where $\delta(t)$ is the Dirac delta function. III(t) has the curious property that it is its own Fourier Transform
$$FT(III) = III(f) \qquad (3.2)$$

The convolution of two functions X(t) and Y(t) is denoted by the asterisk
$$(X*Y)(t) = \int X(s)\, Y(t-s)\, ds,$$
and by the convolution theorem one has

$$FT(X*Y) = x(f)\, y(f) \qquad (3.3)$$

Using (3.2) and (3.3) one has

$$FT(X \cdot III) = x(f) * III(f) = \sum_n x(f-n) \qquad (3.4)$$

which coincides with the periodic repetition of the spectrum x(f).

If X(t) is band-limited to the Nyquist range $-\tfrac{1}{2} < f < \tfrac{1}{2}$, the replications do not overlap.

From  the  previous  discussion  it  follows  that  the  process  in  determining  the  approximating  spline  S(t)  in 
the  least  squares  sense,  can  be  described  with  the  following  system  configuration. 


Fig. 3.1  Determination of the least squares approximation
The symbol between P(t) and P*(t) stands for the sampling mechanism.

In the sequel the transfer function of each system component is analyzed separately and, by applying the
convolution theorem, the relation between s(f) and x(f) is obtained.

a.  Convolution and sampling
The relation between X(t) and P(t) is given by
$$P(t) = (X * B_k)(t)$$

With help of the convolution theorem and (3.1) it follows
$$p(f) = x(f)\, b_k(f) = x(f)\, \mathrm{sinc}^k(f)$$
For k=3, the function $\mathrm{sinc}^3(f)$ is plotted in figure A.2.
Next, applying the sampling theorem gives

$$p^*(f) = \sum_n x(f-n)\, \mathrm{sinc}^k(f-n)$$


b.  Solving  the  normal  equations 

The relation between the sampled convolution P*(t) and sampled B-spline coefficients C*(t) is given by
$$P^*(t) = (B_{2k}^* * C^*)(t)$$

Applying the convolution theorem from equation (2.4) gives
$$p^*(f) = b_{2k}^*(f)\, c^*(f) = d_k(f)\, c^*(f), \qquad d_k(f) = \sum_n \mathrm{sinc}^{2k}(f-n)$$

Thus

$$c^*(f) = \frac{p^*(f)}{d_k(f)}$$

For k=1, 2 and 3 the inverse transfer function $d_k(f)$ of the deconvolution process is given explicitly below.
$$d_1(f) = 1$$

$$d_2(f) = 1 - \tfrac{2}{3} \sin^2 \pi f$$

$$d_3(f) = 1 - \sin^2 \pi f + \tfrac{2}{15} \sin^4 \pi f$$

In general the periodic function $d_k(f)$ can be computed from (using eq. (3.7))
$$d_k(f) = \sum_\nu B_{2k}(\nu)\, e^{i 2\pi \nu f}$$
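The closed forms for k=2 and k=3 can be checked against the general series. The sketch uses the integer samples of the cubic and quintic B-splines, B_4(0) = 2/3, B_4(±1) = 1/6 and B_6(0) = 11/20, B_6(±1) = 13/60, B_6(±2) = 1/120, and also compares with the sum of shifted sinc powers truncated at a finite number of terms; the function names and the truncation are choices made here.

```python
import math

B4 = {0: 2.0 / 3.0, 1: 1.0 / 6.0}                 # samples of B_4 at 0, +-1
B6 = {0: 11.0 / 20.0, 1: 13.0 / 60.0, 2: 1.0 / 120.0}


def d_from_samples(samples: dict, f: float) -> float:
    """d_k(f) = sum over nu of B_2k(nu) * exp(i 2 pi nu f), using symmetry."""
    return samples[0] + sum(2.0 * v * math.cos(2.0 * math.pi * n * f)
                            for n, v in samples.items() if n > 0)


def d2(f: float) -> float:
    return 1.0 - (2.0 / 3.0) * math.sin(math.pi * f) ** 2


def d3(f: float) -> float:
    s2 = math.sin(math.pi * f) ** 2
    return 1.0 - s2 + (2.0 / 15.0) * s2 ** 2


def sinc(f: float) -> float:
    return 1.0 if f == 0 else math.sin(math.pi * f) / (math.pi * f)


def d_from_sinc(k: int, f: float, terms: int = 50) -> float:
    """Truncated sum over n of sinc^(2k)(f - n)."""
    return sum(sinc(f - n) ** (2 * k) for n in range(-terms, terms + 1))


for f in (0.0, 0.1, 0.25, 0.5):
    print(f, round(d_from_samples(B6, f), 6), round(d3(f), 6))
```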

The actual relation between P*(t) and C*(t) is determined by the chosen implementation method, see Appendix A.
If the Cholesky decomposition method is used, the relation between p*(f) and c*(f) almost equals the
ideal relation (3.8).

The periodic transfer function $d_3(f)$ is plotted in figure A3. When the local deconvolution process is used
according to equation (A.3), the relation between P*(t) and C*(t) is given by

$$C^*(t) = \sum_{n=-M}^{M} a_n\, P^*(t-n) = (A^* * P^*)(t)$$

The given values $a_n$ coincide with a given sampled function A*(t). In the frequency domain this results in

$$c^*(f) = a^*(f)\, p^*(f) \qquad (3.9)$$

From relations (3.9) and (3.6) the values $a_n$ may be derived by demanding
$$a^*(f)\, d_k(f) = 1$$

This Finite Impulse Response (FIR) design problem can be treated in different ways; see, for instance, (Rabiner, L.R., '75).

c. Reconstruction

The relation between C*(t) and S(t) is given by
$$S(t) = (C^* * B_k)(t)$$
or in the frequency domain

$$s(f) = \mathrm{sinc}^k(f)\, c^*(f) \qquad (3.10)$$

So,  the  transfer  function  of  the  reconstruction  process  equals  the  transfer  function  of  the  convolution  process 
before  sampling. 

From the frequency characteristics of each of the system components the relation between x(f) and s(f) can
easily be derived; this results in

$$s(f) = \sum_n T_n(f)\, x(f-n) \qquad (3.11)$$

The functions $T_n(f)$ are for the exact deconvolution given by

$$T_n(f) = \frac{\mathrm{sinc}^k(f)\, \mathrm{sinc}^k(f-n)}{\sum_m \mathrm{sinc}^{2k}(f-m)} \qquad (3.12)$$

and for the local deconvolution by

$$T_n(f) = a(f)\, \mathrm{sinc}^k(f)\, \mathrm{sinc}^k(f-n), \qquad a(f) = \sum_{j=-M}^{M} a_j\, e^{-i 2\pi j f}$$

herein expression (3.1) is used for $b_k(f)$.

In the sequel the local deconvolution process is not further analysed.

An interpretation of (3.11) is as follows. The frequency response s(f) consists of a proper response, as from a
linear system ($T_0(f)$), and residual responses from a multiply displaced input spectrum ($T_n(f)$, $n \neq 0$); this latter
effect will be called aliasing. If x(f) is band-limited to the Nyquist range, s(f) will contain no aliasing
into the Nyquist range, but it will contain attenuated replications of the band-limited function x(f), which
are translated outside the Nyquist range (equation 3.11).

Assuming a more or less flat input spectrum, a good measure for the aliasing effect is the function $T_A(f)$:

$$T_A^2(f) = \sum_{n \neq 0} T_n^2(f) = T_0(f)\,\big(1 - T_0(f)\big) \qquad (3.13)$$

There is no aliasing where $T_0$ is either unity or zero.

In figure A4 the response proper and the aliasing measure $T_A$ are shown for the case k=3 using formulae (3.12) and
(3.13).

In the table below the 3 dB power cut-off or 70 % amplitude response is given with respect to $T_0(f)$, together with some
values of $T_A(f)$, for several orders k of the B-spline.

From figure A4 and table 1, one can see that $T_0(f)$ approaches the ideal zonal low-pass filter with impulse
response sinc(t), and this approximation becomes better when the order of the spline increases. When data reduction
is achieved using ideal zonal low-pass filtering, all the computations involved, i.e. convolution and interpolation,
concern infinite series. In the approach proposed here only the deconvolution process consists of such a
computation, but this has to be done only once, independent of the number of interpolations. Each interpolation
thereafter is easily performed using a few B-spline coefficients.

TABLE 1

Frequency characteristics of the spline approximation

k    70 % resp.    T_A(0.5)    T_A(0.45)    T_A(0.4)    T_A(0.3)
1    f = 0.324     0.491       0.499        0.494       0.440
2    f = 0.444     0.5         0.465        0.378       0.188
3    f = 0.465     0.5         0.421        0.273       0.079
4    f = 0.473     0.5         0.373        0.190       0.024
5    f = 0.479     0.5         0.323        0.129       0.017
6    f = 0.483     0.5         0.275        0.087       0.009

From table 1 one can see that aliasing decreases rapidly for increasing k; however, the computation of the
convolution needs more effort for increasing k.
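
The trend in table 1 can be explored with a short numerical sketch. It assumes the exact-deconvolution response $T_0(f) = \mathrm{sinc}^{2k}(f) / \sum_m \mathrm{sinc}^{2k}(f-m)$ and takes the aliasing measure as the amplitude $\sqrt{T_0 (1 - T_0)}$; the square-root form is an assumption, chosen because it gives values close to the tabulated entries for low orders. Function names and the truncation length are illustrative.

```python
import math

def sinc(f):
    """Normalised sinc, sin(pi f)/(pi f), with sinc(0) = 1."""
    return 1.0 if f == 0 else math.sin(math.pi * f) / (math.pi * f)

def t0(k, f, terms=2000):
    """Proper response T_0(f) = sinc^(2k)(f) / sum_m sinc^(2k)(f - m)."""
    denom = sum(sinc(f - m) ** (2 * k) for m in range(-terms, terms + 1))
    return sinc(f) ** (2 * k) / denom

def t_a(k, f):
    """Aliasing measure, taken here as sqrt(T_0 (1 - T_0))."""
    t = t0(k, f)
    return math.sqrt(max(t * (1.0 - t), 0.0))

# One row per spline order, columns at f = 0.5, 0.45, 0.4, 0.3.
for k in range(1, 7):
    print(k, [round(t_a(k, f), 3) for f in (0.5, 0.45, 0.4, 0.3)])
```

Small differences in the last digit with respect to the table are to be expected from rounding and from the truncated sum.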


4. HARDWARE IMPLEMENTATION ASPECTS


4.1  Introduction 

The algorithm for image processing with use of splines as described in the previous chapters leads to a
data reduction and reconstruction system which processes large amounts of image data with a high input rate.
Because of this high data rate, real-time processing requires special effort and therefore it is of much
interest to know which part should be performed in real-time.

Because data compression is required for reduction of transmission bandwidth or recording capacity on board,
and because the input data stream is continuous, this should be an on-line real-time process. However, for the
data reconstruction process there are three options: it can be performed on-line and in real-time, off-line and
not in real-time, or off-line and in real-time, depending on the application.

If both real-time reduction and reconstruction are required, the high data rates involved require the
development of two special-purpose processors: one for on-board data reduction and another for image
reconstruction on the ground.

If off-line real-time reconstruction is acceptable, the presented data reduction/reconstruction method is
of special interest because the off-line real-time reconstruction of the image can be divided into a
non-real-time preprocessing step (deconvolution) and a real-time replay (interpolation) step. The interpolation process
lends itself well to implementation on a special-purpose processor, whereas deconvolution, which is more
complicated, can be programmed on a general-purpose computer. In case of an off-line non-real-time reconstruction,
both deconvolution and interpolation should be implemented on a general-purpose computer.

In this chapter, we further concentrate on the implementation of the real-time, on-board data reduction
by a special-purpose processor, because this part of the processing is most critical concerning volume, power
consumption etc., and is functionally identical for all applications.

All  implementation  structures  given  in  this  paper  are  proposals,  so  they  have  not  been  realized  and  tested 
in  practice. 


4.2  The  algorithm 


The algorithm consists of three parts, as was shown in chapter 3.

Only the first part (the correlation process) is an on-board process, yielding the decimated data stream.
Once on the ground, this stream is first converted into D and then interpolated, which results in the
reconstructed image S(s,t). In practice the image X(s,t) will be sensed with a scanner giving a representation which
is continuous in scan direction and discrete in cross-scan direction. Likewise, S(s,t) is imaged with a
scan-by-scan method. Therefore, $X_n(t)$ and $S_n(t)$ will be used as descriptors hereafter; n is the line number.


4.3 Implementation of one-dimensional spline correlator


4.3.1  Introduction 

As indicated in 2.2 the two-dimensional spline correlation can be constituted from two one-dimensional
correlations. A simple implementation using this property is made with two one-dimensional convolution
filters with a buffering system in between. First the information is convolved in scan direction, giving
decimated data to the line buffering, which is needed for the second, cross-scan convolution. The line filtering
can be implemented as a continuous time filter as well as a discrete time filter, depending on the way the
line information is represented. The cross-scan filter is always a discrete time filter.

In the following sections a number of implementations of one-dimensional B-spline correlators will be
outlined. Section 4.4 gives an indication of the two-dimensional filter.

In the following it is assumed that the order of the splines (k) is 3. Other orders result in similar
implementations.


4.3.2  One-dimensional  analog  correlator 

Essentially, the analog B-spline correlator can be made as a linear, time-invariant analog filter, of
which the impulse response has the form of the (symmetrical) B-spline (see Fig. 4.1), followed by a sampler.


Fig. 4.1 $B_3$-spline function


This approach does not lead to a simple solution, because a finite-duration, symmetrical impulse response
is required. However, it is possible to make use of the special properties of the B-spline correlation integral,
which leads to a hybrid solution consisting of a combined analog and digital filter. The operation to be
performed is (for correlation of line information)

$$P_{n,u} = \int B_3(t-u)\, X_n(t)\, dt$$

n = line number
t = continuous variable in line direction
u = correlation number (integer)

(note the sample-interval scaling to 1).

The values $P_{n,u}$ are in fact the samples of

$$P_n(\tau) = \int B_3(s-\tau)\, X_n(s)\, ds$$

at integer values of $\tau$.

In order to design the analog correlator, first a method is found for computing $P_n(t)$, which
is done by means of Fourier transforms.

The Fourier transform of $P_n(t)$ is (using (3.1) and (3.3))

$$P_n(f) = \mathrm{sinc}^3(f)\, X_n(f) = [\,i 2 \sin(\pi f)\,]^3\, \frac{X_n(f)}{(i 2 \pi f)^3}$$

The function $X_n(f)/(i 2\pi f)^3$ is the Fourier transform of the third-order primitive of $X_n(t)$, denoted by $X_n^{[-3]}(t)$.

The function $i 2 \sin(\pi f)$ is the Fourier transform of $\delta(t+\tfrac12) - \delta(t-\tfrac12)$. Using the convolution theorem (3.3) we can
express $P_n(t)$ in terms of $X_n^{[-3]}(t)$:

$$P_n(t) = \big(X_n^{[-3]}(s) * [\delta(s+\tfrac12) - \delta(s-\tfrac12)]^{*3}\big)(t)$$

$$\phantom{P_n(t)} = X_n^{[-3]}(t+\tfrac32) - 3 X_n^{[-3]}(t+\tfrac12) + 3 X_n^{[-3]}(t-\tfrac12) - X_n^{[-3]}(t-\tfrac32)$$

Thus the samples $P_{n,u}$ of $P_n(t)$ can be written as

$$P_{n,u} = X_n^{[-3]}(u+\tfrac32) - 3 X_n^{[-3]}(u+\tfrac12) + 3 X_n^{[-3]}(u-\tfrac12) - X_n^{[-3]}(u-\tfrac32)$$

Accepting a time delay of $\tfrac32$, we obtain the expression

$$P_{n,u+\frac32} = X_n^{[-3]}(u+3) - 3 X_n^{[-3]}(u+2) + 3 X_n^{[-3]}(u+1) - X_n^{[-3]}(u)$$

This leads to the realisation of figure 4.2.


Fig. 4.2 Structure analog B-spline correlator


This  realisation  is  built-up  with  a  series  of  three  integrators,  serving  as  an  analog  presampling  filter,  a 
sampler,  and  a  very  simple  digital  filtering  operation. 
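
The equivalence between the correlation integral and the third difference of the triple integral can be illustrated numerically. The sketch below is only a check under stated assumptions: the test signal is a hypothetical cosine whose third-order primitive is known in closed form, the quadratic B-spline is taken as centred on [-3/2, 3/2], and the integral is estimated with a simple midpoint rule. None of the names come from the original text.

```python
import math

def b3(t):
    """Centred quadratic (order-3) B-spline with support [-3/2, 3/2]."""
    a = abs(t)
    if a <= 0.5:
        return 0.75 - a * a
    if a <= 1.5:
        return 0.5 * (1.5 - a) ** 2
    return 0.0

F = 0.2                      # test frequency in cycles per sample (arbitrary)
W = 2 * math.pi * F

def x(t):
    """Hypothetical line signal."""
    return math.cos(W * t)

def x_prim3(t):
    """One third-order primitive of x: its third derivative is cos(W t)."""
    return -math.sin(W * t) / W ** 3

def direct(u, steps=6000):
    """Midpoint-rule estimate of the integral of B3(t - u) x(t) dt."""
    h = 3.0 / steps
    return h * sum(b3(-1.5 + (j + 0.5) * h) * x(u - 1.5 + (j + 0.5) * h)
                   for j in range(steps))

def third_difference(u):
    """X^[-3](u+3/2) - 3 X^[-3](u+1/2) + 3 X^[-3](u-1/2) - X^[-3](u-3/2)."""
    g = x_prim3
    return g(u + 1.5) - 3 * g(u + 0.5) + 3 * g(u - 0.5) - g(u - 1.5)

for u in range(4):
    print(u, round(direct(u), 6), round(third_difference(u), 6))
```

Any polynomial of degree two or less added to the primitive (the integration constants) is annihilated by the third difference, which is why a single closed-form primitive suffices here.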

A drawback of this realisation is the fact that the integrators will overflow after some time. Because
the information is processed line after line the resulting harm may be kept within bounds, but the large
dynamic range needed for the integrators, and thus for the analog-to-digital conversion, is still a disadvantage.
It is currently being investigated at HIB whether a compensation for the overflow by means of a controlled input
offset is possible.

The processing capacity of this realization, that is the maximal number of correlations that can be computed, is
determined by the subtraction time. In case three parallel subtractions are used with a cycle time of 100 ns,
approx. $10^7$ correlations per second are computed.

When the dynamic-range problems of the integrators are set aside, this hybrid solution is attractive because
of its relative simplicity and the large processing capacity. Moreover the delivered correlation coefficients
$P_{n,u}$ are basically exact, as opposed to the output of the fully digital realizations described hereunder.

4.3.3  One-dimensional  digital  correlator 

The B-spline correlator may also be realised as a discrete time filter with a sample stream as input. In
that case the real correlation in scan direction is approximated by:

sampling: $X_{n,m} = X_n(m/r)$

discrete correlation: $P_{n,u} = \sum_m B_3(m/r - u)\, X_{n,m}$

This approximation is attained by sampling the input signal with sampling frequency r, and correlating it
with a sequence obtained by sampling $B_3(t)$ with frequency r. In this way an FIR filter (finite-duration impulse
response filter) is generated, serving as a decimation filter with reduction factor r. The sampling frequency
must be sufficiently high in order to limit aliasing in the input signal as well as in the B-spline. It is
possible to avoid aliasing in the input signal by inserting a presampling filter, but this will produce additional
linear distortion. In practice the aliasing error in the representation of the B-spline as a series of samples
is acceptable for $r \geq 4$.

In cross-track direction there is no choice between continuous and discrete correlation: because of the
line scanning the information to be correlated is always discrete. The corresponding operation, for output line
index v and correlation number u, is here the sum:

$$\sum_k B_3(k/r - v)\, P(k/r, u)$$

where $P(k/r, u) = P_{k,u}$.

The discrete implementation methods described here are worked out for a reduction factor (r) equal to 4.
Then, the operation to be implemented takes the form:

$$y_{4p} = \sum_{i=0}^{11} a_i\, x_{4p-i}, \qquad p \text{ an integer.}$$

This corresponds to a vector multiplication of a fixed coefficient vector with a variable sample vector,
built up from 12 consecutive samples.

The 12 coefficients $a_i$ are listed in table I. An interesting fact is that the binary representation as a
fixed point fraction needs only 7 bits, without rounding.

Further, it is assumed that the input samples are quantized in 8 bits.


TABLE I

The correlation coefficients $a_i$,
for third order spline
(k = 3) and reduction factor r = 4.

 i    a_i (dec)      a_i (bin)
 0    0.001953125    0000001
 1    0.017578125    0001001
 2    0.048828125    0011001
 3    0.095703125    0110001
 4    0.15234375     1001110
 5    0.18359375     1011110
 6    0.18359375     1011110
 7    0.15234375     1001110
 8    0.095703125    0110001
 9    0.048828125    0011001
10    0.017578125    0001001
11    0.001953125    0000001
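
The tabulated coefficients can be regenerated from the quadratic B-spline. The sketch below assumes that the spline is sampled at the midpoints of the twelve quarter-intervals of its support and that the samples are scaled by 1/r; this sampling offset and scaling are inferences that happen to reproduce the table, not statements taken from the text. The binary column is read as the seven least significant bits of a 9-bit fraction.

```python
from fractions import Fraction

def b3(t):
    """Centred quadratic B-spline, exact for Fraction arguments."""
    a = abs(t)
    if a <= Fraction(1, 2):
        return Fraction(3, 4) - a * a
    if a <= Fraction(3, 2):
        return (Fraction(3, 2) - a) ** 2 / 2
    return Fraction(0)

R = 4                                   # reduction factor
# Sample at t = (i + 1/2)/R - 3/2 for i = 0..11 and scale by 1/R.
coeffs = [b3(Fraction(2 * i + 1, 2 * R) - Fraction(3, 2)) / R for i in range(3 * R)]

for i, a in enumerate(coeffs):
    numerator = int(a * 512)            # every a_i is a multiple of 1/512
    print(i, float(a), format(numerator, "07b"))
```

Because every coefficient is an integer multiple of 1/512 and the largest numerator fits in seven bits, the coefficients can be held without rounding, in line with the remark in the text.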

Fig. 4.3 Structure digital B-spline correlator with multiplier and
coefficients ROM


4.3.3.1 Method 1. Application of multiplier and coefficient ROM

An obvious implementation method uses a hardware multiplier and a read-only memory (ROM) for storing the
coefficients $a_i$. A proper structure (see fig. 4.3) is obtained by immediately calculating all relevant
contributions to the vector product series, for every new input sample; so, for k=3, there are 3 multiplications
and 3 additions at a time. The vector products arise as accumulations of contributions from 12 input samples
each.

In this way, only three intermediate results have to be stored.

This is of special interest for the processing in cross-scan direction, see 4.4.1.

The coefficients ROM is small, only 12 x 7 bits. The adder and the registers must be at least 17 bits wide
if full-length calculations are required without round-off error. In that case, a processing capacity of approx.
550,000 correlations/s, with a cycle time of 150 ns, is easily attainable with commercially available
components.

The structure of this implementation depends on the reduction factor required. Note, however, that the
processing capacity, in correlations/s, does not depend on this reduction factor.
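
The bookkeeping of Method 1 can be sketched in software. The example below is an illustration under assumptions, not a description of the proposed hardware: coefficients are held as integers (512 times the tabulated values), samples before the start of the line are taken as zero, and the names `streaming` and `direct` are hypothetical. Each incoming sample is multiplied by three coefficients and added to three running sums; every fourth sample completes one sum.

```python
COEFFS = [1, 9, 25, 49, 78, 94, 94, 78, 49, 25, 9, 1]   # 512 * a_i from the table

def direct(samples, r=4):
    """Reference: 512 * y_n for n = 0, r, 2r, ..., with x_m = 0 for m < 0."""
    return [sum(c * samples[n - i] for i, c in enumerate(COEFFS) if n - i >= 0)
            for n in range(0, len(samples), r)]

def streaming(samples, r=4):
    """Three multiply-accumulates per input sample, three running sums."""
    n_acc = len(COEFFS) // r          # 3 partial sums are alive at any time
    acc = [0] * n_acc
    out = []
    for m, x in enumerate(samples):
        first = -(-m // r) * r        # nearest output index >= m
        for j in range(n_acc):
            n = first + j * r         # output that receives this sample
            acc[(n // r) % n_acc] += COEFFS[n - m] * x
        if m % r == 0:                # output n = m has all its contributions
            slot = (m // r) % n_acc
            out.append(acc[slot])
            acc[slot] = 0             # free the register for a later output
    return out

line = [(7 * i * i + 3 * i) % 256 for i in range(50)]    # made-up 8-bit samples
print(streaming(line) == direct(line))
```

Only three accumulators are needed regardless of the line length, which is the property the text singles out as useful for the cross-scan direction.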

4.3.3.2 Method 2. Application of special vector multiplier

Because the operation to be performed is a vector multiplication of a variable (sample) vector with a
fixed (coefficients) vector, the vector multiplier as first described by Peled and Liu, and Little (Peled, A., '74 and
Little, W.D., '74) can be applied. Its principles are as follows.

The operation:
$$y_n = \sum_{i=0}^{M-1} a_i\, x_{n-i}$$

may be written as:
$$y_n = \sum_{i=0}^{M-1} a_i \sum_{j=1}^{N} x_{n-i,j}\, 2^{-j}$$

when the $x_{n-i}$ are N-bit positive fractions.

Definition of $C_{n,j}$ and interchanging of the order of the summations give

$$y_n = \sum_{j=1}^{N} C_{n,j}\, 2^{-j}, \qquad C_{n,j} = \sum_{i=0}^{M-1} a_i\, x_{n-i,j}$$

The coefficients $C_{n,j}$ can assume just $2^M$ different values, which can be calculated in advance and
stored in a ROM. Instead of being calculated each time, they now can be selected by means of the set $x_{n-i,j}$,
which are the jth bits of the M values $x_{n-i}$. Next, the selected coefficients $C_{n,j}$ are summed in a shifting
accumulator, i.e. the intermediate results are shifted over 1 bit after each addition. Figure 4.4 gives the
structure of this method.
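
The table-lookup principle can be sketched as follows. This is a software illustration under assumptions: samples are treated as unsigned 8-bit integers rather than fractions (so bit j carries weight $2^j$), coefficients are the integer numerators 512 times $a_i$, and the accumulator processes the most significant bit plane first. The names `ROM` and `vector_multiply` are hypothetical.

```python
COEFFS = [1, 9, 25, 49, 78, 94, 94, 78, 49, 25, 9, 1]   # 512 * a_i
M, N = len(COEFFS), 8                                    # taps, bits per sample

# ROM: entry 'addr' holds the sum of the coefficients whose address bit is set.
ROM = [sum(c for i, c in enumerate(COEFFS) if addr >> i & 1)
       for addr in range(1 << M)]

def vector_multiply(window):
    """window[i] is x_{n-i}, an unsigned N-bit integer; returns 512 * y_n."""
    acc = 0
    for j in range(N - 1, -1, -1):            # most significant bit first
        # Address formed by the j-th bits of all M samples.
        addr = sum(((x >> j) & 1) << i for i, x in enumerate(window))
        acc = (acc << 1) + ROM[addr]          # shift, then add selected entry
    return acc

window = [(37 * i + 11) % 256 for i in range(M)]         # made-up samples
print(vector_multiply(window), sum(c * x for c, x in zip(COEFFS, window)))
```

One output costs N lookups and additions whatever the number of taps, at the price of a ROM with $2^M$ entries.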

A disadvantage of this method may be the large ROM needed for the coefficients $C_{n,j}$. In case a high
processing speed is required, it must be built up with bipolar devices. An alternative structure is obtained by
making use of the symmetry in the series $a_i$: $a_0 = a_{11}$, $a_1 = a_{10}$ and so on. The operation:

$$y_n = \sum_{i=0}^{11} a_i\, x_{n-i}$$

can therefore be written as:

$$y_n = \underbrace{\sum_{i=0}^{5} a_i\, x_{n-i}}_{\mathrm{I}} + \underbrace{\sum_{i=0}^{5} a_i\, x_{n-11+i}}_{\mathrm{II}}$$

where the sets $a_i$ are identical for both I and II. The summations I and II can then be calculated separately,
making use of the same ROM; its size is now reduced to 64 x 9 bits. Finally both partial vector products must
be summed.
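
The quoted ROM size can be checked with a small sketch. It assumes integer coefficients (512 times $a_0 \ldots a_5$ from the table) and unsigned 8-bit samples; the helper names are hypothetical. The same 64-entry table serves both partial sums, with the second half of the window read in reverse order.

```python
HALF = [1, 9, 25, 49, 78, 94]        # 512 * a_0 .. 512 * a_5 (a_i = a_{11-i})

# 2^6 = 64 entries, one per combination of six address bits.
ROM = [sum(c for i, c in enumerate(HALF) if addr >> i & 1) for addr in range(64)]
print(len(ROM), max(ROM), max(ROM).bit_length())   # entries, largest entry, bits

def half_product(samples, bit):
    """ROM lookup for one bit plane of six samples."""
    addr = sum(((x >> bit) & 1) << i for i, x in enumerate(samples))
    return ROM[addr]

def y(window, nbits=8):
    """window[i] = x_{n-i}, i = 0..11; returns 512 * y_n via sums I and II."""
    part_i = window[:6]                   # x_n ... x_{n-5}
    part_ii = window[11:5:-1]             # x_{n-11} ... x_{n-6}
    return sum((half_product(part_i, j) + half_product(part_ii, j)) << j
               for j in range(nbits))
```

The largest table entry is the sum of the six numerators, 256, which needs nine bits; this agrees with the 64 x 9 bits stated above.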

In  order  to  adapt  the  vector-multiplier  structure  to  the  specific  task  of  fast  spline  correlation  it 
should  be  modified  such  that  it  can  cope  with  a  continuously  running  input  data  stream.  This  is  accomplished  by 
inserting  registers  between  the  arithmetic  elements,  which  results  in  a  pipeline  structure,  and  by  modifying 
the  input  register-file  structure,  so  that  the  computation  is  repeated  every  time  four  new  samples  are  received. 

The processing speed attained with 100 ns cycle time is approx. 1.2 million correlations per second. The
speed is halved in the alternative structure, with ROM and accumulator multiplexed.

The  structure,  again,  depends  on  the  reduction  factor,  in  contrast  with  the  upper  bound  for  the  number  of 
correlations  per  second. 


4.3.3.3 Method 3. Application of accumulators, based on properties of the B-spline

In contrast with both previous methods, the accumulator method described here is limited to application to
B-spline correlation. The method has a strong relationship with the integrator method for analog correlation
(see 4.3.2).

The operation to be performed is:
$$y_{4p} = \sum_{i=0}^{11} a_i\, x_{4p-i}, \qquad p \text{ integer.}$$

Ignoring for the moment that only one in four correlations has to be calculated, application of the
z-transform yields:

$$Y(z) = X(z) \sum_{i=0}^{11} a_i\, z^{-i}.$$

Hence, the transfer function is

$$H(z) = \sum_{i=0}^{11} a_i\, z^{-i} = \frac{1}{512}\big(1 + 9z^{-1} + 25z^{-2} + 49z^{-3} + 78z^{-4} + 94z^{-5} + 94z^{-6} + 78z^{-7} + 49z^{-8} + 25z^{-9} + 9z^{-10} + z^{-11}\big).$$

Factorization yields:

$$H(z) = \frac{1}{512}\,(1 + z^{-1} + z^{-2} + z^{-3})^3\,(1 + 6z^{-1} + z^{-2}) = \frac{1}{512} \left(\frac{1 - z^{-4}}{1 - z^{-1}}\right)^3 (1 + 6z^{-1} + z^{-2}).$$

This expression lends itself to the simple implementation shown in figure 4.5, which consists of mere
additions, subtractions and registers, and one multiplication with a factor 6.
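
The factorization can be verified by multiplying the factors out. The sketch below represents each polynomial in $z^{-1}$ as a list of integer coefficients; `polymul` is an illustrative helper, not something defined in the text.

```python
def polymul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

box = [1, 1, 1, 1]                       # 1 + z^-1 + z^-2 + z^-3
h = polymul(polymul(polymul(box, box), box), [1, 6, 1])
print(h)   # [1, 9, 25, 49, 78, 94, 94, 78, 49, 25, 9, 1]
```

The result is the list of numerators of the twelve correlation coefficients, each to be divided by 512.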

This  multiplication  is,  however,  easily  accomplished  with  one  additional  adder. 

Apparently, there could be problems due to overflowing accumulators, but this is not the case because
modulo $2^{17}$ arithmetic is applied. This is possible because the range of the resulting values is known. The
range determines the number of bits involved in an addition or subtraction, which influences the end result.
Therefore, overflow in the intermediate results is of no importance. When 8-bit samples are used, all
computations are done in 17 bits.
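
Why overflow is harmless can be shown with a software model. The sketch below is one possible arrangement of the factorized transfer function, assumed for illustration and not taken from figure 4.5: the factor $1 + 6z^{-1} + z^{-2}$ is applied first, followed by three cascaded accumulators at the input rate, decimation by four, and three differences at the output rate. All registers wrap modulo $2^{17}$; samples are unsigned 8-bit integers and outputs are 512 times the correlation values.

```python
MOD = 1 << 17                 # 17-bit registers
COEFFS = [1, 9, 25, 49, 78, 94, 94, 78, 49, 25, 9, 1]    # 512 * a_i

def direct(samples, r=4):
    """Reference: 512 * y_n for n = 0, r, 2r, ..., with x_m = 0 for m < 0."""
    return [sum(c * samples[n - i] for i, c in enumerate(COEFFS) if n - i >= 0)
            for n in range(0, len(samples), r)]

def accumulator_method(samples, r=4):
    x1 = x2 = 0               # two previous input samples
    s1 = s2 = s3 = 0          # three cascaded accumulators (they wrap around)
    prev = [0, 0, 0]          # previous input of each difference stage
    out = []
    for m, x in enumerate(samples):
        v = (x + 6 * x1 + x2) % MOD          # factor 1 + 6 z^-1 + z^-2
        x2, x1 = x1, x
        s1 = (s1 + v) % MOD                  # 1 / (1 - z^-1), three times
        s2 = (s2 + s1) % MOD
        s3 = (s3 + s2) % MOD
        if m % r == 0:                       # keep one value in four
            u = s3
            for k in range(3):               # (1 - z^-4)^3 at the low rate
                u, prev[k] = (u - prev[k]) % MOD, u
            out.append(u)
    return out

line = [(97 * i * i + 13 * i + 5) % 256 for i in range(400)]   # made-up samples
print(accumulator_method(line) == direct(line))
```

The largest possible output is 255 x 512 = 130560, which is below $2^{17}$ = 131072, so the final result is recovered exactly even though the accumulators wrap many times along the line.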

For a practical implementation according to this method, it is taken into account that only one in four
correlations has to be calculated. Hence a number of additions, subtractions and registers in the final part of the
structure can be deleted, resulting in a total of 5 adders/subtractors and 8 registers, plus a simple
control structure.

The  processing  speed,  attainable  with  commercially  available  hardware  is  approx.  2.5  million  correlations/s, 
with  a  100  ns  cycle  time.  This  number  decreases  with  higher  reduction  factors. 


4.4  Implementation  of  two-dimensional  spline  correlator 


4.4.1  Introduction 

The two-dimensional spline correlation consists of two one-dimensional correlators (Fig. 4.6), of which
the first works in scan direction, and the second in cross-scan direction. The first correlator processes the
line information, line after line; the second one operates simultaneously on $n_l / r_l$ correlations ($n_l / r_l$ is the
reduced number of pixels on a line). This parallelism necessitates a form of intermediate buffering.

The size of the intermediate buffering depends on the implementation method chosen for the second
correlator, and further on the reduced number of pixels per line. The relation is:

$$M = \frac{n_l}{r_l} \times R$$

M = buffer size in words
$n_l$ = number of pixels/line
$r_l$ = reduction factor in line direction
R = number of registers in one-dimensional implementation.

The choice for the second correlator is therefore mainly determined by the number of registers. There is no
relation between the size of the intermediate buffer and the chosen implementation method for the first
correlator; hence this choice depends on more general requirements.

4.4.2  Analog  correlation  in  line  direction  and  digital  correlation  in  cross  direction 

If represented as an analog signal the information in line direction can be correlated by the analog
method. The output of this line filter is a sequence of numbers for every line. Next, the sequences are
rearranged in order to be processed in cross-scan direction, in the second correlator.

For the second correlator, the digital implementation is chosen with a minimum number of registers, resulting
in method 1, section 4.3.3.1.

Assuming that only the digital correlation determines the upper bound for the processing speed that can
be obtained, the maximum output data rate is $550 \times 10^3$ pixels/s.


4.4.3  Digital  correlation  in  both  line  and  cross  directions 

If the image information to be processed has been sampled before, or if full digital processing is
preferred, both correlators are digital. For the second correlator (cross direction) there is no difference as
compared with the previous section. The choice for the first correlator, however, is not mainly determined by
the number of registers, but for instance by the complexity and number of arithmetic elements.

For the first correlator mainly the methods 1 and 2 can be chosen. Method 2 is only considered if the
large-scale integrated multiplier ICs cannot be applied, for instance because of the high expense.

Method 1 is especially attractive in case of moderate requirements on the processing speed, so that the
first correlator can be partly combined with the second one, which is also implemented according to method 1.
Especially the arithmetic elements can be shared.

The highest speed is attainable with method 3. The disadvantage of the somewhat higher component count is
alleviated by the fact that only low-cost standard TTL components are used.


Fig. 4.5 Structure accumulator method for $B_3$-spline correlation


Fig. 4.6 Structure two-dimensional B-spline correlator


5.  EDGE-ENHANCEMENT 

From the results (see chapter 6) obtained with the image approximation method as outlined before, one can
see that sharp edges in the original image are smeared out in the image approximation. For some purposes this
may be an undesired phenomenon. In order to suppress this smearing effect, an edge enhancement algorithm is
designed. The idea of the algorithm is described with help of the figure below. It is assumed that the measured
radiance from the sensor is quantized in 8 bits. The radiance levels, 0-255, are now divided into $2^N$ intervals
(N is a fixed number between 0 and 8). In figure 5.1, 4 intervals of equal size are distinguished. However,
these sizes may also be chosen in an adaptive way.


In applying the algorithm on the input signal X(n,m), two data streams are created. Stream 1 contains the
information about the interval for each pixel; the second stream contains the relative radiance level of each
pixel within the interval, as is shown in figure 5.1 for one scanline. For the example in figure 5.1 the two
data streams are plotted explicitly in the figures 5.2 and 5.3.

Fig. 5.2 Stream 1, after edge filtering


Fig. 5.3 Stream 2, after edge filtering


Data stream 2 is in fact a nonlinear conversion of the input signal, as follows:



Fig. 5.4 Stream 2 versus input to edge filter

It  is  clear  that  it  is  possible  to  reconstruct  the  image  X(n,m)  from  stream  1  and  stream  2,  with  a  so- 
called  inverse  edge  filter.  After  applying  the  edge  filter,  stream  1  stays  unaltered  but  stream  2  is  seen  as  a 
new  image  (it  has  less  edges  than  the  image  X(n,m))  and  is  approximated  with  a  spline  S(n,m)  as  outlined  in 
chapter  2.  Finally,  stream  1  and  the  approximation  S(n,m)  are  added  in  the  inverse  edge  filter,  giving  the 
approximation  S(n,m)  to  the  image  X(n,m). 
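
The splitting and recombination can be sketched for the case of equal-size intervals. The example assumes 8-bit radiances and $2^N$ intervals realised by taking the N most significant bits as stream 1 and the remaining bits as stream 2; the function names and the sample scanline are made up for illustration. In the actual system stream 2 would be replaced by its spline approximation before recombination, so the reconstruction would be approximate rather than exact.

```python
def edge_filter(pixels, n_bits=2, depth=8):
    """Split radiance values into interval indices and in-interval levels."""
    shift = depth - n_bits
    stream1 = [p >> shift for p in pixels]               # interval number
    stream2 = [p & ((1 << shift) - 1) for p in pixels]   # level inside interval
    return stream1, stream2

def inverse_edge_filter(stream1, stream2, n_bits=2, depth=8):
    """Recombine interval index and relative level into a radiance value."""
    shift = depth - n_bits
    return [(i << shift) + level for i, level in zip(stream1, stream2)]

scanline = [10, 60, 70, 130, 200, 190, 65, 20]   # made-up 8-bit radiances
s1, s2 = edge_filter(scanline)
print(s1)
print(s2)
print(inverse_edge_filter(s1, s2) == scanline)
```

With N = 2 the large jumps of the scanline end up in stream 1, while stream 2 only spans 0-63, which is the sense in which it contains fewer edges than the original image.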

In summary, the edge filter is used in the following system configuration.



Fig.  5.5  Image  approximation  with  edge-enhancement 

By taking N big enough (N determines the number of intervals) the accuracy of S(n,m) becomes better, but
data stream 1 then consists of more bits describing the intervals. However, for small N these bits are highly redundant
and can therefore be encoded such that fewer bits are required for transmission or storage of stream 1.

6.  SIMULATION  RESULTS  FOR  LANDSAT  DATA 

6.1 Preliminary results of approximation with splines for LANDSAT data

The simulation results described in this chapter have been obtained by combining computer programs to
process part of a LANDSAT Computer Compatible Tape (CCT) of real remote sensing data. The area chosen is around
the town of Harderwijk, containing remnants of Lake IJssel, regular linear structures in the newly reclaimed
Lake IJssel polders and irregular fields of a rich structure.

Consequently high spatial frequencies are present in the data. The dimensions are 240 x 240 pixels. The results
shown (photo 1-3) are of the original data, a least-squares parabolic spline approximation and an approximation
obtained by averaging 2 x 3 pixels. The spline intervals are also 3 pixels in horizontal lines and 2 pixels
in vertical lines. Each surface is, neglecting boundary effects, described by 80 x 120 coefficients for each
colour (resulting in a data reduction by a factor of six).

Computing the approximating image required about 27.5 CPU seconds on the CYBER 72 at NLR. About 2.5 CPU
seconds were needed for solving the normal equations (deconvolution) using the Cholesky method and 12.5 seconds
for both the computation of the B-spline correlation data and the reconstruction of the image. As was shown in
chapter 4, a large gain in processing speed is attained with a special-purpose processor for the correlation
process. The same result is expected for the deconvolution and interpolation processes.

Comparing photo 1 and photo 2, it is clear that a number of features in the original image have been
smeared out or reappear at displaced positions in the reconstructed image. This was expected due to the
low-pass filter characteristics of this approximation method, resulting in a decrease of spatial resolution. However,
down to a fair level of detail the overall structure is reproduced better than in the approximation obtained by
averaging 2 x 3 pixels (photo 3).

One can see that especially edges from the original image are smeared out in the approximation. Therefore, in
chapter 5 an edge enhancement algorithm was designed in order to have a better edge reproduction in the
approximation. In the next section some results are given after application of this edge enhancement algorithm in
combination with the least-squares approximation method.

6.2  Preliminary  results  of  approximation  with  splines  and  edge  enhancement  for  Landsat  data 

Using  the  system  configuration  of  figure  5.5, a  B-spline  approximation  with  edge  enhancement  as  outlined 
before,  is  applied  on  an  image  of  the  area  around  the  town  of  Harderwijk.  The  result  is  shown  on  photo  4 .  The 
value  of  2^,  denoting  the  number  of  intervals  in  the  radiance  levels  is  chosen  to  be  4  for  the  green  colour  and 
8  for  the  red  colour.  The  four  intervals  are  bounded  by  the  points  ( i— 1 ) x  64 -£,  i»1(l)  5  and  the  eight  intervals 
are  bounded  by  the  points  (i-l)x  32-J ,  i=l(l)  9. 

The  difference  in  intervals  is  due  to  the  fact  that  in  the  red  colour  the  most  significant  bit  in  the  original 
image  is  not  varying. 

The choice of the boundary points is independent of the image to be approximated, so better results may be
expected when these points are chosen adaptively.

The spline intervals are 3 pixels in horizontal lines and 2 pixels in vertical lines. The overhead information
from the edge filter (stream 1) is about 1 bit/pixel for each colour. Thus, the total data reduction achieved
is a factor 4.

Comparing photo 1 and photo 4, one can see that a large part of the edges present in the original image are
enhanced in the approximation. However, there are also vague areas in the approximation. It seems that these
areas are copies from the approximation using only B-splines (photo 2). The reason for their appearance is
that the number N was too low for their detection and/or the boundary points were not chosen optimally.

Another remarkable phenomenon is the occurrence of isolated dark and light coloured pixels in photo 4 only. This
is a result of applying the edge enhancement algorithm when an edge in the original image is not precisely
divided over the two intervals defined by the boundary points. However, when the boundary points are
chosen optimally this phenomenon will be less severe.


7. CONCLUSIONS

1.  Least-squares  image  approximation  using  splines  is  a  good  method  for  real-time  data  reduction  on-board, 
followed  by  an  image-reconstruction  on  the  ground. 

2. The method is a special case of low-pass filtering and therefore results in a decrease of spatial
resolution. When the order of the spline increases, the method converges to the ideal tonal low-pass filter and
the aliasing effects disappear.

3. For the data reduction step it is possible to develop a special-purpose processor for real-time application
up to an output data rate of 550 x 10^3 pixels/s.

4. The main computational effort in the data processing on the ground lies in the preprocessing of the
compressed data. The image interpolation from the modified compressed data is relatively simple.

5.  The  method  is  especially  tuned  to  applications  with  simple  data  reduction  hardware  on-board  and  a  more 
extensive  data  processing  on  the  ground. 

6.  Real-time  reconstruction  of  the  image  is  difficult  due  to  the  complex  preprocessing  of  the  compressed  data. 
However,  by  simplifying  or  deleting  the  preprocessing,  it  is  possible  to  have  a  Quick-look  facility 
offering  lower  image  quality. 

7.  The  proposed  edge-enhancement  algorithm,  which  adds  some  extra  information  to  the  compressed  data  set,  can 
improve  the  image  quality  considerably. 
APPENDIX A, SOLVING THE NORMAL EQUATIONS


In approximating the input function X(t) on the interval [0,T] with a spline S(t) of order k in the least
squares sense, one has to solve the normal equations (2.6). In vector notation, one has to solve the linear
problem

$A\,c = p$   (A.1)

The square matrix A consists of zeros except on the main diagonal and the 2k-2 upper and lower diagonals. The
vector c denotes the unknown B-spline coefficients $\{c_\nu\}$ and the known vector p denotes the left part of
equation (2.6).

For  the  case 
(A.2)


Finding the exact solution c of (A.1) is a time consuming effort. However, for some applications an
approximation of the exact solution c may be acceptable, for instance for use in a Quick Look facility. For real
time applications a compromise has to be found between the quality of the approximation and the time needed for
computing the approximation.




One way of approximating the solution to equation (A.1) is computing an approximation $\{c^*_\nu\}$ of the form

$c^*_\nu = \sum_{\mu=-M}^{M} a_\mu \, p_{\nu+\mu}$   (A.3)

In this approximation one is inspired by the fact that the coefficients $b_{ij}$ of $A^{-1}$ satisfy the decay rate
bound

$|b_{ij}| < \mathrm{const} \cdot \lambda^{|i-j|}, \qquad |\lambda| < 1$

The value of $\lambda$ depends upon the order k of the B-spline. In chapter 3 this type of approximation is also treated
in the frequency domain.

In finding the almost exact solution of (A.1), an efficient method is found that is faster than the approximation
(A.3). A disadvantage of this method will be discussed later. The method is based upon a kind of Cholesky
decomposition of the matrix A, i.e. one can write the matrix A of (A.1) as

$A = C\,D\,C^T$   (A.4)

Only for the last (k-1) rows of A is this relation not valid, but this is neglected in the beginning; a
correction for this is made later on.

The matrix C is an upper triangular matrix with non-zero elements on the main diagonal and the (k-1) upper
diagonals; moreover, the matrix elements on the same diagonal are equal.

The matrix D is the Kronecker sum of a non-singular (k-1) x (k-1) matrix and the identity matrix.

Thus, applying the decomposition (A.4), one can easily solve (A.1) using recurrence relations, yielding the solution
c. However, the decomposition (A.4) is not unique; it should be chosen such that the linear system Cx = b can be
solved in a stable manner using recurrence relations. For k=3, the decomposition of A from equation (A.2) is
given below.
Finally, the necessary correction should be made to the solution c because of the inaccurate decomposition (A.4). From
the discussion above it follows that the effect of an element of the input data p dies out exponentially
in the solution sequence $\{c_\nu\}$ (controlled by the matrix C). By demanding a prescribed accuracy in the solution
$\{c_\nu\}$ one can determine the accurate part of the solution c. The inaccurate part of c should be computed again
with use of the decomposition:

$A = C^T D' C$

where the matrix D' is the Kronecker sum of the identity matrix and the previous non-singular (k-1) x (k-1)
matrix.

A disadvantage of the decomposition method is that the input data p has to be passed two times in opposite
directions.
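
The two-pass character of such a solve can be illustrated with a small sketch. It is not the constant-diagonal decomposition described above: it uses a generic banded LDL^T factorisation, and the band values (66, 26, 1)/120 are an assumption chosen for illustration as a Gram matrix of uniform parabolic B-splines. The function names are hypothetical.

```python
def banded_ldlt_solve(bands, p):
    """Solve A c = p where A[i][j] = bands[|i-j|] inside the band, else 0.

    A is assumed symmetric positive definite. The factorisation A = L D L^T
    keeps the band structure, so the solve is one forward and one backward
    recurrence over the data.
    """
    n, w = len(p), len(bands) - 1
    low = [[0.0] * (w + 1) for _ in range(n)]   # low[i][k] holds L[i][i-k]
    d = [0.0] * n
    for i in range(n):
        for j in range(max(0, i - w), i + 1):
            s = bands[i - j]
            for m in range(max(0, i - w), j):
                s -= low[i][i - m] * d[m] * low[j][j - m]
            if j == i:
                d[i] = s
                low[i][0] = 1.0
            else:
                low[i][i - j] = s / d[j]
    # Forward pass: L y = p
    y = [0.0] * n
    for i in range(n):
        y[i] = p[i] - sum(low[i][i - m] * y[m] for m in range(max(0, i - w), i))
    # Backward pass: L^T c = D^-1 y
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        c[i] = y[i] / d[i] - sum(low[m][m - i] * c[m]
                                 for m in range(i + 1, min(n, i + w + 1)))
    return c


def banded_matvec(bands, c):
    """Multiply the banded matrix defined by `bands` with the vector c."""
    n, w = len(c), len(bands) - 1
    return [sum(bands[abs(i - j)] * c[j]
                for j in range(max(0, i - w), min(n, i + w + 1)))
            for i in range(n)]


# Assumed band values for illustration only.
BANDS = [66 / 120, 26 / 120, 1 / 120]
impulse = [0.0] * 41
impulse[20] = 1.0
response = banded_ldlt_solve(BANDS, impulse)
```

Feeding a unit impulse through the solver shows the exponential decay mentioned in the text: the response ten samples away from the impulse is already several orders of magnitude smaller than at the impulse itself, which is why a short truncated sum can serve as a quick-look approximation.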


Fig. A.1 Isometric projection plot of parabolic B-spline


Fig. A.3 Plot of the function $(1 - \sin^2(\pi f) + \tfrac{2}{15}\sin^4(\pi f))^{-1}$


Fig. A.4 Graph of logarithms of the proper transfer function $T_0(f)$ (fat) and
the aliasing measure $T_n(f)$ (thin)
SUMMARY


A multiprocessor system for image processing is presented. Structural requirements are
shown for frequently used image processing tasks. Design principles and implementation
techniques to realize high performance capabilities in a multiprocessor system are
described in terms of simultaneous processing, structural flexibility, and data
input/output. The FLIP (Flexible Image Processing system) is operated in conjunction with
a host computer and can perform up to 64 MIPS (Million Instructions Per Second). FLIP can
be considered as a multi-pipeline processor comprising 16 individual processors and a high
speed data input/output processor. The structural flexibility of FLIP is achieved by its
individual processors. They can be arranged by programming in nearly any manner to adapt
the hardware to an optimal processing structure for the function to be processed. FLIP
achieves simultaneous processing by combining parallel processing and pipelining of
operations. Since synchronization of the individual processors is data-flow controlled, the
structural behaviour of the processing system is easy to survey. A special data exchange
processor provides convenient and rapid access to image data, especially for all kinds of
homogeneously performed window operations.

Some practical applications (e.g. image differentiation, image convolution) are explained
to demonstrate FLIP's system performance. The experimental results of the developed system
show that FLIP reduces image processing times by factors between 10 and 100 compared with
conventional techniques (e.g. using a 1 MIPS general purpose computer).

1.  INTRODUCTION 


Processing of pictorial data on a conventional sequential computer is very time consuming.
This is a problem not only for real-time applications, as in most military
applications, but also in the field of experimental simulations. Picture processing
requirements have increased so much in the past that it is unrealistic to believe that
today and in the near future the desired processing times can be met by conventional
computers. To overcome this problem many special image processing computers with parallel
processing capabilities have been proposed. The performance of such systems is often
evaluated by comparing the execution time of a specific image processing task on the
system under consideration with the execution time of the same task on a sequential
computer.

The architecture of image processing systems is mostly driven by the structural
requirements of image processing tasks. The evaluation of those requirements shows that
for the various steps of image processing different structures are needed.

With the Flexible Image Processing System (FLIP) we designed and built a flexible
processing structure which combines simultaneous processing and structural flexibility.
Its main application is the computation of homogeneous and parallel operations often used for
picture preprocessing and feature extraction. Those parallel operations are spatial image
filtering, template matching, local correlation, etc.

1.1.  Structural  Requirements  for  an  Image  Processing  System 

To determine the processing structure of a new system it is necessary to evaluate the
structural requirements of typical image processing tasks. The following simple
algorithm, called "stroke difference", is an example.

With the submatrix notation of Figure 1 the stroke difference is given by
Gx(x,y) = |(F(1,1) + F(2,1) + F(3,1)) - (F(1,3) + F(2,3) + F(3,3))| / 3
Gy(x,y) = |(F(1,1) + F(1,2) + F(1,3)) - (F(3,1) + F(3,2) + F(3,3))| / 3
G(x,y) = (Gx(x,y) + Gy(x,y)) / 2
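
As an illustration, the stroke difference can be sketched in a few lines of Python. Figure 1 is not reproduced here, so the mapping of F(i,j) to the list element `w[i-1][j-1]` is an assumption, as are the function names.

```python
def stroke_difference(w):
    """Return (Gx, Gy, G) for a 3x3 window w, with F(i,j) taken as w[i-1][j-1]."""
    def f(i, j):
        return w[i - 1][j - 1]
    gx = abs((f(1, 1) + f(2, 1) + f(3, 1)) - (f(1, 3) + f(2, 3) + f(3, 3))) / 3
    gy = abs((f(1, 1) + f(1, 2) + f(1, 3)) - (f(3, 1) + f(3, 2) + f(3, 3))) / 3
    return gx, gy, (gx + gy) / 2


def stroke_difference_image(image):
    """Shift the 3x3 window over the image and repeat the same computation."""
    rows, cols = len(image), len(image[0])
    return [
        [stroke_difference([row[c:c + 3] for row in image[r:r + 3]])[2]
         for c in range(cols - 2)]
        for r in range(rows - 2)
    ]
```

The second function makes the homogeneous character of the operation explicit: the same expression is evaluated, unchanged, at every window position.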


This processing is a local or window operation. The picture elements are taken from
a submatrix (window) which is systematically shifted over the whole image, and the
processing is repeated without change (i.e. homogeneously) for each window position.

To evaluate the structural requirements it is useful to draw the corresponding computing
graph of the mathematical expression. For this purpose the algorithm is decomposed into
elementary operations. For the above example the operations are displayed as nodes
and the processing flow by connecting arrows in Figure 2.

The graph shows two specific principles:
- Parallel processing: This type of processing is characterized by the concurrent
computation of unrelated operations (e.g. operations 1-4 in the first stage
(column), 5-8 in the second stage, etc.).

- Pipelining: This type of processing is characterized by the sequential data
flow through several processors (if we use one processor for each operation)
and the parallel working of all stages (e.g. the operations 1, 5, 9, 11, 13
and 14 constitute a pipeline).

Both principles are widely used to speed up computation. The computing graph of our
example shows the combination of these two principles, which can briefly be described
as simultaneous processing. The speed of computation in this structure is dictated by
the slowest operation within one pipeline. The problem of system optimization is to
distribute the operations in such a way that each processor uses nearly the same
computation time.

The design of the FLIP was greatly influenced by the evaluation of image processing
tasks like the above mentioned stroke difference method. As a result, the FLIP comprises
the following general features:

- Multiprocessor system: High computation power is provided by 16 processors

- Structural programmability: It is possible to arrange the processors in any
desired structure by software means

- Parallel input data stream: A programmable data exchange processor (PEP)
provides a high capacity parallel input data stream

- Effective and simple synchronization: The synchronization of the 16 processors
is data flow controlled and supported by hardware

- Programmable operations: Each processor can be programmed to perform complex
and different operations.

2.  IMAGE  PROCESSING  SYSTEM  FLIP 

2.1.  FLIP  System  Configuration 

The FLIP system is connected to a host computer (Figure 3) and is operated like a
peripheral device. Image data to be processed often come from a disk, and after
processing the results are also stored on a disk or alternatively displayed
on a raster display. The host computer itself is part of a larger system and is also
connected to two other computer systems (with additional image input/output devices).
So, the processing capabilities of the FLIP are also available to users of the two other
systems.

The FLIP consists of the central processing unit FIP (Flexible Individual Processors)
and the data exchange processor PEP (Peripheral data Exchange Processor). The FIP itself
has 16 IPs (Individual Processors), and the PEP, which is mainly a fast and intelligent
buffer device, has a fast bipolar memory of 24 KBytes and three internal processors.
Additionally the FLIP is connected to a MOS image memory (768 KBytes) by a high speed
data path.

Image processing with the FLIP is usually done on a data stream flowing from the disk
to the PEP, from the PEP to the FIP, and after processing within FIP, via PEP back to
the disk. The three data streams (input, processing and output) flow in parallel.
Figure 4 shows this data stream from the viewpoint of image processing. Out of the input
image, which is transferred pixel by pixel (pixel = picture element), the PEP forms a
parallel data stream for processing. For this, the PEP holds at least those data which
are simultaneously required by the FIP, e.g. the lines of the image covered by the window
currently being processed. The FIP merges the parallel data stream back into a single data
stream, which is subsequently stored as the output image.

This double funnel-shaped data flow is a feature often found in image processing tasks.
Our example of the "stroke difference" operates exactly in this manner.

2.2.  Processing  Unit  FIP 

The realized FIP system with 16 individual processors possesses all the capabilities which
are required to achieve a flexible and programmable structure. The system works without
any external control for synchronization; the data flow is used to synchronize the
individual processors. A processor executing an input instruction will wait until valid
data are present on the requesting port. An extensive bus system is used to physically
connect all individual processors with each other (Figure 5). Each individual processor
has its own output bus with eight parallel bits of data. Each of these buses is connected
to the input ports of all other processors. To increase the system speed and to ease the
programming of desired structures each processor is equipped with two independent input
ports. By this, each individual processor may have three data transfers at the same time,
two at its input ports and one at the output bus. The instruction format reflects this
feature by the provision of three address fields, so each individual processor is a three
address machine.




All individual processors of FIP are identical. In order to achieve high processing speed,
data and instructions are internally separated both in storage and signal path. In each
processor the storage capacity is 50 bytes (8 bit) for data and 256 words for instructions
(32 bit). Figure 6 shows the functional block diagram of an individual processor. Two
independent input control units (IAC, IBC) can simultaneously select one bus out of 16
buses. The output control (OC) has its own one byte buffer, allowing output data transfer
to be done in parallel with program execution. The internal program execution is controlled
by an asynchronous network. Internal data are represented by eight bits. The instruction
repertoire of each IP includes the basic arithmetical and logical instructions, the 8 bit
by 8 bit multiplication, shift instructions, and instructions to handle multiple precision
arithmetic. Subroutines can be called and special control instructions (e.g. stop
instruction) are implemented. The execution time of each instruction depends on the
complexity of its operation and addressing mode. The measured mean execution time is
250 nsec.
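
The quoted peak rate of 64 MIPS is consistent with this mean execution time, as a short calculation shows. This is a sketch under the assumption that all 16 processors execute instructions continuously at the mean time; the function name is hypothetical.

```python
MEAN_INSTRUCTION_TIME_S = 250e-9   # measured mean execution time from the text
NUM_PROCESSORS = 16                # individual processors in the FIP


def peak_mips(n_processors=NUM_PROCESSORS, t_instr=MEAN_INSTRUCTION_TIME_S):
    """Aggregate instruction rate in millions of instructions per second."""
    return n_processors / t_instr / 1e6


print(round(peak_mips()))  # 4 MIPS per processor times 16 processors
```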

The data input to FIP is performed by the PEP on additional data buses. To achieve a
parallel data stream (as required by the chosen structure of an image processing task)
the FIP and PEP are connected with 16 independent buses (buses A and B of Figure 5)
providing a possible data rate of 45 Mbyte/s. Unlike the FIP internal buses the A and
B buses carry destination addresses, and the PEP can choose any of the 16 IPs as
destination. For data output from the FIP three buses are provided which are connected to the
PEP, the MOS image memory and the host computer (with DMA), respectively. The data
input/output is done in parallel with program execution, as is the FIP internal data
flow on the FIP buses. An additional control and instruction bus directly controlled by
the host computer is provided for program loading and control purposes.

2.3.  Data  Exchange  Unit  PEP 

The data flow between FLIP and the host computer is accomplished by three exchange
processors (EP), a data memory of 24 KB, and associated interfaces, all
contained within the PEP (Figure 7). PEP is designed to provide input data for the FIP
processing unit and mainly to support the homogeneous type of image processing. A multiple
data stream is formed from the single input data stream coming from the host system.
Working on a 10 x 10 submatrix, for example, the data rate to FIP can be 100 times greater
than the rate of the sequential input data stream.

The access to the data stored in the data memory is freely programmable and almost
unlimited. By this, a flexible use of the FLIP system is guaranteed. Additional addressing
hardware is available for the homogeneous type of image processing, so that the user need
not be concerned with the special problems of this type of data input and output. For window
operations, the size of the data memory (24 KB) limits the maximum image line length that can be
directly processed. For example, with a line length of 240 we can access a submatrix of
100 x 100 pixels window size, whereas with a line length of 8000 we can only handle a
window of 3 x 3 pixels. Since FLIP acts like a peripheral device it must be controlled and
supervised by the host computer. This means that within the host a program must provide
at least two independent data streams:

- One input data stream usually originating from a mass storage device
(e.g. disk).

- One output data stream going to a mass storage device or a display device.
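
The trade-off between line length and window size quoted above follows from a simple division, sketched here. The assumptions are one byte per pixel, the whole data memory available for image lines, and 24 KB taken as 24 000 bytes; with 24 x 1024 bytes the first figure would come out slightly higher (102 lines). The function name is hypothetical.

```python
def max_window_rows(line_length, memory_bytes=24_000):
    """Number of complete image lines that fit in the PEP data memory."""
    return memory_bytes // line_length


for length in (240, 8000):
    rows = max_window_rows(length)
    print(f"line length {length}: window of up to {rows} x {rows} pixels")
```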

Within the FLIP system the input data stream is handled by the input/output controller
(Figure 7) and transferred to the PEP data memory. The multiple data stream to the FLIP
is controlled by the three PEP processors. The data are then temporarily stored in the
data buffers and transferred on 16 buses to the FIP processors. The results from the FIP
are returned and transferred to the host or to other storage devices under the control of
the input/output controller.

2.4.  FLIP  -  Principle  of  Operation 

To establish a desired processing structure on FLIP, the individual processors of FIP
as well as the PEP processors (EP1-EP3) must be programmed. As soon as all programs
are loaded the processing structure is latent in the system and processing can be started.
This is usually initiated by the data transfer from the PEP to the FLIP. The PEP
itself gets the data from the host computer by DMA (Direct Memory Access). This data flow
scheme provides the host computer with complete control over the execution of the entire
processing. Within FIP any desired structure can be established by programming, from the
pipeline through the cascade to other parallel structures. Two simple examples are
illustrated in Figure 8. The only means of establishing a connection between two FIP
processors is by the address parts (operand address = bus address) within the instruction
words of the processor requesting the data. In this way, the construction of a logical
data path is done by the processors receiving the data. In those processors which are to
send the data, output instructions according to the data requests must be programmed (see
example in next section). As soon as valid data is present on a selected bus it is latched
and confirmed by the input port of the processor executing the input instruction. Since
each processor possesses two independent input ports the temporal appearance of data from
different buses is of no importance. In the case of an input operation without valid data
at one or both input ports, the processor will remain in a waiting state.
The data output is initiated after the execution of an instruction containing the
addressing mode 'data to output-bus'. The data is subsequently latched for output and
program execution is resumed. If the last data has not been taken by another processor, the
execution of a second instruction with output to bus has to wait until the first data
transfer is completed. If the previous data transfer has already finished, the execution
proceeds without interruption. In this way, the FLIP has an asynchronous sequence control
achieved by mutual synchronization of the individual processors through their joint data
flow.
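
The waiting behaviour described above can be imitated in software, as a rough sketch only: threads stand in for individual processors and one-slot queues stand in for the one-byte output buffers. The end-of-stream marker and all names are artefacts of the emulation, not features of FLIP.

```python
import queue
import threading


def run_dataflow(pairs_a, pairs_b):
    """Two 'difference' processors feed one 'collector' over one-slot buses."""
    if len(pairs_a) != len(pairs_b):
        raise ValueError("both input streams must have the same length")
    bus1 = queue.Queue(maxsize=1)   # output buffer of the first processor
    bus2 = queue.Queue(maxsize=1)   # output buffer of the second processor
    results = []

    def difference(pairs, bus):
        for a, b in pairs:
            bus.put(abs(a - b))     # waits while the previous value is untaken
        bus.put(None)               # end-of-stream marker (emulation only)

    def collector():
        while True:
            x = bus1.get()          # waits until valid data is present
            y = bus2.get()
            if x is None or y is None:
                break
            results.append((x + y) // 2)

    threads = [
        threading.Thread(target=difference, args=(pairs_a, bus1)),
        threading.Thread(target=difference, args=(pairs_b, bus2)),
        threading.Thread(target=collector),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

No global clock or scheduler appears in the sketch: each stage simply blocks on its inputs and on its output buffer, so the order of results is fixed by the data flow alone, whatever the relative speed of the threads.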


2.5.  Supporting  Software 

The FLIP is programmed in its machine languages, called FAL (FIP Assembly Language) and PAL
(PEP Assembly Language), respectively. The object modules are linked by FLINK (FLIP
LINKage editor), which arranges the relative bus addresses within the objects to establish
the desired logical structure on FLIP. Program execution is initiated and supervised by
the FLEX (FLIP Executive) utility running on the host computer. FLEX controls all steps:
the transfer of image data from a mass storage to FLIP, the FLIP execution, and finally
the storing of the results produced by FLIP. Additional features of the control program
FLEX are the capability to debug FLIP programs and to locate program errors by the use of
trace routines, and to evaluate program efficiency by monitoring the activities of FLIP
(FIP and PEP) by time measurements.

3. FLIP IMAGE PROCESSING APPLICATIONS

Considering the special capabilities of FLIP described in the previous sections, one
can easily see that FLIP will find advantageous applications in the large field of
picture processing. A great number of frequently used tasks, e.g. simple array
operations for image filtering (linear or nonlinear), contour sharpening, and image
enhancement, imply parallel processing structures that can be directly implemented with FLIP.
Other time consuming image processing functions, e.g. local image correlation,
image convolution or some gray level statistics, are also suitable for implementation on
the FLIP, which has already been done successfully. In the following, a selection of processing
structures is given in order to demonstrate FLIP image processing, to give a
performance evaluation, and to give a programmer's view of FLIP by means of programming
techniques for FIP and PEP for two simple examples.

3.1.  Element  Difference 

The element difference method is a local operation used for generating the first
derivative of a picture F. It is a special case of the stroke difference method already
described in the first section. The value of the difference G(x,y) defined in both
algorithms may be treated as a measure characterizing the gradient of the intensity
distribution at the position (x,y). With the notation given in Figure 1 the element
difference is given by:

Gx(x,y) = |F(2,1) - F(2,3)|

Gy(x,y) = |F(1,2) - F(3,2)|

G(x,y) = (Gx(x,y) + Gy(x,y)) / 2.
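
A minimal Python sketch of the element difference follows. As Figure 1 is not available here, F(i,j) is assumed to map to `w[i-1][j-1]`; the helper `add_with_carry` is a hypothetical illustration of how the sum of two 8-bit differences splits into a low byte and a carry, in the spirit of the multiple precision add used in the listing below.

```python
def element_difference(w):
    """Return (Gx, Gy, G) for a 3x3 window w, with F(i,j) taken as w[i-1][j-1]."""
    def f(i, j):
        return w[i - 1][j - 1]
    gx = abs(f(2, 1) - f(2, 3))
    gy = abs(f(1, 2) - f(3, 2))
    return gx, gy, (gx + gy) / 2


def add_with_carry(a, b):
    """Add two 8-bit values and return (less significant byte, carry)."""
    total = a + b
    return total & 0xFF, total >> 8
```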

Considering the principles leading to the processing structure for the stroke difference,
there is no difference in establishing a computing graph for the element difference
function (Figure 9). This processing structure for the element difference is characterized
by three pipelines with three processing stages each. Obviously, four processors
at relatively equal load are needed to perform the element difference for one submatrix
position. However, this would be a poor use of FLIP, as the other 12 IPs of FIP
would remain unused. Due to the structural flexibility of FLIP, in our example it is
possible to speed up computation by means of additional simultaneous processing, e.g.
by replicating the simple processing structure for the element difference
as far as processors are available. Figure 9 shows 12 IPs organized in three pipelines
producing the element differences of three different submatrix positions at a time. The
processor in the fourth stage of the processing cascade collects the results and outputs
them for storage or further processing. The data inputs for
processors IP1-IP6 are fed from the PEP. Since PEP provides input data to FIP at a rate of
up to 45 Mbyte/s, the loading of IP1-IP6 with input data can be considered to happen in
parallel.

Now, to show how simple it is to establish such a processing structure and to organize
the data transports, the individual programs for the PEP processor EP1 and the FIP
processors IP1, IP2, and IP3 are listed below:

PEP - processor EP1:

START:  MOV   F(2,1),IP1(A1)   ; Transfer pixel F(2,1) from buffer memory via bus A1 to
                               ; processor IP1
        MOV   F(2,3),IP1(B1)   ; same with F(2,3) and bus B1
        MOV   F(1,2),IP2(A1)   ; same with F(1,2) and bus A1 for processor IP2
        MOV   F(3,2),IP2(B1)   ; etc.
        IBR   P,START          ; Increment pointer to next window position and branch
                               ; to START


FIP - processor IP1 and IP2:

START:  SUB   A1,A2,SC1        ; Subtract gray levels of opposite pixels
        MOVM  SC1,OB           ; Output the absolute value
        BR    START            ; Goto START (next window-position)


Processor IP3:

START:  CLR   SC1              ; Provide ZERO-value
LOOP:   ADD   IP1,IP2,OB       ; Add differences and output less significant byte
        ADC   SC1,OB           ; Output Carry (multiple precision add)
        BR    LOOP             ; etc.

Due to similarities between the programs listed above and the programs for
processors EP2, EP3, and IP5 - IP12, the latter are omitted here.

Finally, the output data rate is dictated by the processors with the heaviest load within
the pipeline. The heaviest load found in the structure is 3 instructions per window
position. Table 1 at the end of this section lists the selected FLIP image processing
functions with their execution time.
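
A rough order-of-magnitude estimate can be derived from this load figure. The sketch below assumes the mean instruction time of 250 nsec quoted earlier, one result per heaviest-load cycle, and a 1024 x 1024 image; it ignores data input/output, initialization and the threefold replication of the pipeline, so it is not a substitute for the measured values of Table 1. The function name is hypothetical.

```python
def estimated_processing_time(instr_per_window, t_instr=250e-9,
                              width=1024, height=1024):
    """Seconds needed if every window position costs the heaviest pipeline load."""
    return instr_per_window * t_instr * width * height


# 3 instructions per window position, as stated for the element difference.
print(f"{estimated_processing_time(3):.2f} s")
```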

3.2.  Two-Dimensional  Convolution 

Two-dimensional convolution is frequently used in spatial filtering, template matching,
etc. It operates like a window operation and is described by the following expression:

$G(x,y) = \sum_i \sum_j W(i,j)\, F(i,j,(x,y))$

where W(i,j) is the weight matrix and F(i,j,(x,y)) are the image points covered by the
weight matrix at location (x,y). Commonly used submatrix dimensions range from
3 x 3 to 11 x 11 image elements. With FLIP it is possible to achieve one processing
structure that is applicable to all submatrix dimensions considered. Nevertheless,
processing structures will be more efficient if they are devoted to only one or perhaps
two different submatrix sizes. Figure 10 shows a FLIP processing structure working on
submatrix sizes of 5 x 5 and 7 x 7 pixels.
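
The window expression above can be written out directly as a sequential reference sketch. It evaluates the sum exactly as given, without flipping the weight matrix, and only at positions where the window lies fully inside the image; both choices, and the function name, are assumptions made for illustration.

```python
def convolve2d(image, weights):
    """G(x,y) = sum over i,j of W(i,j) * F(i,j,(x,y)) for fully covered windows."""
    n, m = len(weights), len(weights[0])
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(rows - n + 1):
        out_row = []
        for x in range(cols - m + 1):
            out_row.append(sum(weights[i][j] * image[y + i][x + j]
                               for i in range(n) for j in range(m)))
        out.append(out_row)
    return out
```

Each output value needs n x m multiplications and additions, which is the work that a FLIP structure distributes over its processors.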

3.3.  Execution  Times 

Table 1 shows the execution times for the above mentioned image processing tasks. The
initialization and program load times are not included in these times.


Table 1

4. 


A powerful and suitable image processing system has been proposed and realized. The system combines a flexible multiprocessor for central processing with a high-speed data exchange processor for fast and convenient data input/output. FLIP can perform up to 64 MIPS (million instructions per second) and can process image line lengths of up to 10 kbytes (depending on the submatrix size) and an unlimited number of lines.

Homogeneous operations in the field of image preprocessing, local image correlation and operations for feature extraction have been programmed. It was found that in most cases it takes less than 5 seconds to perform the image processing tasks considered on images of 1024 x 1024 pixels. The processing time is often less than the time needed to initialize the processing system and to transfer the image data to and from the storage device.

Because of its flexibility and processing power, we believe FLIP will find practical application in other tasks of the same type, namely straightforward operations applied to a large set of data.


AIDS TO INTERPRETATION IN MULTISPECTRAL SATELLITE IMAGERY


M. REBUFFET
Laboratoire de Traitement d'Images
Etablissement Technique Central de l'Armement
94114 ARCUEIL CEDEX
FRANCE


This contribution addresses several ways in which computer image processing methods can help extract the information contained in images taken by an observation satellite.

The purpose of this extraction is to produce maps and to identify planimetric features. It relies on photointerpreters, for whom the number of documents to be consulted must be limited and the man-machine interface made easier.

Three aspects are developed in this contribution:

I Enhancement of a multispectral image and reduction of redundancy,

II Correction of the geometry of an image with respect to a reference,

III Interactive extraction of graphic-type information.

These methods have been implemented at the Laboratoire de Traitement d'Images of ETCA, on a highly interactive display and processing hardware configuration.


I - ENHANCEMENT OF MULTISPECTRAL IMAGERY AND REDUCTION OF REDUNDANCY

The LANDSAT satellite produces magnetic-tape recordings of the multispectral image of the Earth's surface. These data have long been the subject of research aimed at a better knowledge of Earth resources.

In the studies reported here, we used these data to evaluate methods for processing airborne multispectral data and for extracting or detecting objects of military interest.

The resolution of the images delivered by this satellite is suited to objects or sites of large dimensions, greater than 50 m. For smaller objects an extrapolation is necessary, but the general ideas that emerge from this study are easily adapted and can help in the design of image processing systems for military observation satellites.

The main problem in detecting objects of military interest is that the principal information to be extracted is the disturbance that man brings to the natural landscape. This disturbance is small as a percentage of the area of the aerial imagery. Statistical processing is therefore delicate to use in this respect.

The present study concerns one statistical processing method, the Karhunen-Loève transformation in chromatic space, or factor analysis, which in theory makes it possible to obtain from multispectral data decorrelated images (each image carrying information of a different nature) with better contrast.

This transformation is widely used in Earth resources work, but there the features to be detected generally occupy a large percentage of the area.

The Karhunen-Loève transformation yields new images ordered by decreasing signal-to-noise ratio. Earth resources photointerpreters generally find interesting information in the second transformed image, where the nature of the soil appears and the shadows have almost disappeared. The first image is the "panchromatic" image of the landscape which, on the contrary, tends to bring out the shadows. Military users, on the other hand, are a priori interested in both kinds of information.

We therefore studied an area containing a number of installations or sites of military interest, such as industrial zones, urban zones, railways, roads and airfields. We applied the transformation to the whole image and to a number of zones of interest. We measured the stability of this transformation over the whole image and obtained photographic reproductions of each transformed image.


FACTOR ANALYSIS


The method of factor analysis is a systematic but linear approach to a problem expressed by the following questions:

- How many independent variables are there?

- How can they be obtained from the measured variables?

For natural data, which are statistical in character, the following question can also be asked:

- What fraction of the variance of the data is neglected when, after decorrelating the data, only m variables (m < k) are retained?

Suppose that N experiments have been performed, in each of which k experimental signals have been recorded. Are the k signal sources, possibly corrupted by errors that are Gaussian to a first approximation, independent, correlated or redundant?


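Let $(x_{ij})$, $i = 1,\dots,N$, $j = 1,\dots,k$ be the measurements; let

\[ \bar{x}_j = \frac{1}{N} \sum_{i=1}^{N} x_{ij} \]

be the mean of each signal, and let

\[ \sigma_j^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_{ij} - \bar{x}_j)^2 \]

define their standard deviations $\sigma_j$. The correlation matrix between the k signals is constructed as:

\[ R_{lm} = \frac{1}{(N-1)\,\sigma_l\,\sigma_m} \sum_{i=1}^{N} (x_{i,l} - \bar{x}_l)(x_{i,m} - \bar{x}_m) \]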

If the k signals are independent, this matrix is diagonal. If they are correlated but not redundant, the matrix is not diagonal but is non-singular. If the signals are redundant, the matrix is singular and reduces to a matrix of lower rank. Figure 3 shows these three cases for k = 2.


For Gaussian signals (a), the axes of the ellipse are parallel to the coordinate axes, which characterizes independence. For correlated signals (b), the axes are oblique. In the case of redundancy (c), the ellipse reduces to a straight line (see figures 1a - 1b - 1c).

To obtain the inclinations of the principal axes of this ellipse, or of the hyperellipsoid in the general case, it suffices to diagonalize the covariance matrix: the eigenvectors give the directions of the axes and the eigenvalues the relative importance of the diameters of the ellipsoid along each axis. Each axis represents a "factor" or "principal component", and each eigenvalue measures the importance of that factor.

Furthermore, if each measurement is corrupted by an error (cf. 2.3.), and if these errors are independent, which corresponds to the notion of white noise, the signal-to-noise ratio, that is, the ratio between a factor and the projection of the noise on its axis, is proportional to the corresponding eigenvalue.


Application to multispectral data

The preceding method is applied with the N experiments being the N points (or pixels) of the image and the k signals being the k analysis wavelengths (here k = 4).

Performing the factor analysis of a multispectral image therefore amounts to producing sets of four matrices of numbers representing new images in which a particular "colour" is reinforced. The Karhunen-Loève transformation consists in creating these new images, in which the signal-to-noise ratio is optimized for the first image, then for the second, which is decorrelated from the first, then for the third, decorrelated from the first two, and so on.


Flowchart of the factor analysis

The program has to carry out several tasks:

a) Computation of the means of the data

b) Computation of the variances of the data

c) Computation of the correlation matrix

d) Determination and sorting of the eigenvalues and eigenvectors

e) Computation of the hyper-rotation matrix that maps the measured variables to the new orthogonal variables.
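
Steps a) to e) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the function names, the two-channel sample data and the choice of a cyclic Jacobi method for the eigen-decomposition are all assumptions made for the example.

```python
import math


def correlation_matrix(samples):
    """Steps a) to c): means, standard deviations, correlation matrix."""
    n, k = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(k)]
    sigmas = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in samples) / (n - 1))
              for j in range(k)]
    corr = [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in samples)
             / ((n - 1) * sigmas[a] * sigmas[b])
             for b in range(k)] for a in range(k)]
    return means, sigmas, corr


def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Step d): eigenvalues (descending) and eigenvectors of a symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):      # rotate columns p and q of A and of V
                    for r in range(n):
                        mp, mq = m[r][p], m[r][q]
                        m[r][p], m[r][q] = c * mp - s * mq, s * mp + c * mq
                for r in range(n):    # rotate rows p and q of A
                    ap, aq = a[p][r], a[q][r]
                    a[p][r], a[q][r] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    values = [a[i][i] for i in order]
    # Step e): each row is one eigenvector, i.e. one row of the rotation matrix.
    vectors = [[v[r][i] for r in range(n)] for i in order]
    return values, vectors


# Hypothetical data: 5 pixels, 2 strongly correlated channels.
samples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
means, sigmas, corr = correlation_matrix(samples)
values, vectors = jacobi_eigen(corr)
neglected = sum(values[1:]) / sum(values)   # variance dropped when keeping m = 1
```

The quantity `neglected` answers the third question of the factor analysis: the fraction of the variance lost when only the first m sorted components are kept.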


APPLICATION TO A COMPLETE IMAGE


Results and remarks

The correlation between channels 1 and 2 is the strongest: 0.90; by contrast it is very weak between channels 1 and 4: 0.21. The rapid decrease of the eigenvalues shows that the statistical cloud is very flat and extends essentially in the plane formed by the first two eigenvectors.

The thickness of the statistical cloud around this plane is very small, which means that by keeping only the first two transformed images, only 4% of the statistical information is neglected. Displaying the last two transformed images confirms that they are of little interest and contain mostly noise.

However, along the third axis one recognizes, apart from the noise, information that represents the warping of the statistical cloud about the plane defined by the first two axes, and which brings out certain details of high spatial frequency in the image.


The first two transformed images

The first transformed image is obtained by a practically identical weighting of the four raw channels. It therefore has the appearance of the panchromatic image that would have been obtained with a sensor of wide spectral band (0.5 to 1.1 µm). Indeed, the angular deviation in hyperspace between the computed linear combination and the simple mean is 0.1 rad.

The signal-to-noise ratio of this image is twice that of the raw images, which shows up as a marked attenuation of the striping effects present in LANDSAT images.

The brightest points represent quarries, dikes, dry river beds, areas of sparse vegetation on stony ground and concrete structures, in particular the Orange airfield.

The darkest points represent river water, lakes, and tree vegetation.

The second transformed image is obtained as the difference between channels (0.7 - 0.8 µm) and (0.8 - 1.1 µm) and channels (0.5 - 0.6 µm) and (0.6 - 0.7 µm). All vegetation therefore appears very bright in it (high reflectance in the near infrared and low reflectance in the visible). By contrast, and this is of great interest for the intended purpose, everything mineral stands out as dark: river water, pebbles of river beds, tiles and metal of building roofs in towns, concrete slabs of factories, concrete and asphalt of airfield runways, major communication routes, marshalling yards. Note in particular that in this second image the length of airfield runways can be measured without error, whereas no other image (raw or first transformed) allows this when the runways are not made of the same material from end to end.

Applying the Karhunen-Loève transformation to whole multispectral images, or to smaller and smaller fractions of images, brought out several results:

1. The loss of a priori information (as opposed to useful information) incurred in going from four raw images to two transformed images is less than 5%.

2. More importantly, the useful information contained in the last two transformed images is semantically very poor. Indeed, all the noise is seen to concentrate in these two images, and the notion of shape is practically absent from them.

3. The quality of the first two transformed images is markedly superior to that of the raw images. These images are denoised. They therefore make the photointerpreter's work more reliable.

4. A meaning can be attached to each of these two images: a panchromatic image for the first, and an image separating mineral from vegetation for the second.

5. When one looks at zones of small size, the plane defining the first two transformed images has a practically constant orientation. This is very important, because any rotation within this plane preserves the same statistical information; one can therefore search within this plane for the rotation which, with respect to certain criteria, provides the greatest useful information (in the semantic sense).

6. The stability of this plane allows the signal to be heavily subsampled when computing its parameters.

7. The notion of a spectral signature reduced to this plane is meaningful, which opens the way to fast and inexpensive computer assistance for photointerpreters. In particular, a classification using only these first two components is remarkably significant.


II - CORRECTION OF THE GEOMETRY OF AN IMAGE WITH RESPECT TO A REFERENCE

An observation satellite gives an image of the portion of ground it observes that is distorted by its attitude and attitude variations, by the atmosphere and by the relief. When these distortions are not too large, they do not hinder photo-interpretation of the scene at all. But even small distortions prevent any automatic comparison of two views taken on different dates, as well as any positioning of the extracted elements in a ground reference frame. This reference can be either a map, if one is available, or another view of the scene chosen for its good quality.

If some causes of distortion can be modelled and measured, they are corrected beforehand in a deterministic way.

For the other causes, only an interactive registration of the scene on the reference can be used.

It is this procedure that is described here, with its successive steps and the methods used.


ALGORITHMIC ANALYSIS OF THE REGISTRATION CHAIN

SELECTION OF THE ZONE

The interactive image rectification chain accesses cartographic or radiometric data that are stored in refresh memories and are limited to the 512 x 512 format.

A coarse alignment of the two homologous windows is performed by subsampling by a factor of two and presenting to the operator a central 256 x 256 window of the reference, or the whole raw image. By displaying the two memories alternately and moving a 256 x 256 frame in the graphics plane associated with the raw image, the operator extracts a window whose centre roughly coincides with the centre of the reference window.

The coordinates of the upper left corner of the graphics frame are stored in a file and then read by the tape-to-disk transfer program. From then on the working data are stored in the refresh memories of the display console.


CHOICE OF "LANDMARKS" ON THE REFERENCE

Using a 32 x 32 square cursor, the operator selects the area enclosed by the cursor as a "landmark" of the reference image. The statistics associated with this landmark, its radiometric distribution and the location of its upper left corner are stored for the current landmark in the landmark file.

A problem associated with cursor designation on the image plane of the display console should be emphasized here:

The location along a line has an uncertainty of 1 pixel.

The number of landmarks acquired determines the quality of the regression of the measurements onto the polynomial function modelling the distortions.


POINTING THE "LANDMARKS" ON THE SCENE

The operator is shown, in 3 successive passes, the reference image in the upper half of the screen and the raw image in the lower half, together with the selected landmarks. For each landmark, displayed as a square inscribed in a disc, the operator must point to the corresponding zone of the raw image.

This pointing procedure can be refined on image data by searching for the maximum of the correlation matrix over 32 x 32 windows. The correlation is computed globally using discrete Fourier transforms by the formula:

\[ \mathrm{CCF} = \frac{1}{N}\, \mathcal{F}^{-1}\left\{ \mathcal{F}(f) \cdot \mathcal{F}^{*}(g) \right\} \]

where N is a normalization factor, $\mathcal{F}$ the Fourier transform operator, and f and g the two windows being compared.

The peak of this matrix is searched for in a 16 x 16 neighbourhood of the centre and, if it exists, its position is given by the barycentric coordinates in a 5 x 5 window around the peak.

The dynamic range of the correlation matrix is a quality indicator for the current landmark; if it falls below a certain threshold, the operator may reject the landmark.

Computing the discrete transforms on a 32 x 32 window introduces phase errors when discontinuities exist at the boundaries; the operator can therefore optionally apply a window function (e.g. Gaussian) to the processed zones.
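
The peak search with barycentric refinement can be illustrated by the following Python sketch. It is a sketch under stated assumptions rather than the procedure actually implemented: the circular cross-correlation is evaluated directly instead of through discrete Fourier transforms (the two give the same matrix up to normalization), the windows are 8 x 8 instead of 32 x 32, the peak is searched over the whole matrix, and the function names and test pattern are hypothetical.

```python
def circular_ccf(f, g):
    """Circular cross-correlation of two square windows.

    ccf[dy][dx] is largest when g is f shifted by (dy, dx).
    """
    n = len(f)
    return [[sum(f[y][x] * g[(y + dy) % n][(x + dx) % n]
                 for y in range(n) for x in range(n))
             for dx in range(n)]
            for dy in range(n)]


def peak_with_barycentre(ccf, radius=2):
    """Integer peak and its barycentre in a (2*radius+1)^2 window."""
    n = len(ccf)
    py, px = max(((y, x) for y in range(n) for x in range(n)),
                 key=lambda p: ccf[p[0]][p[1]])
    wsum = sy = sx = 0.0
    for oy in range(-radius, radius + 1):
        for ox in range(-radius, radius + 1):
            w = ccf[(py + oy) % n][(px + ox) % n]   # assumes non-negative values
            wsum += w
            sy += w * oy
            sx += w * ox
    return (py, px), (py + sy / wsum, px + sx / wsum)


# Hypothetical landmark: a small cross-shaped pattern.
n = 8
f = [[0] * n for _ in range(n)]
f[3][3] = 4
for oy, ox in ((-1, 0), (1, 0), (0, -1), (0, 1)):
    f[3 + oy][3 + ox] = 1
# The "raw image" window is the landmark shifted by 2 rows and 1 column.
g = [[f[(y - 2) % n][(x - 1) % n] for x in range(n)] for y in range(n)]

peak, sub = peak_with_barycentre(circular_ccf(f, g), radius=2)
```

With `radius=2` the barycentre is taken over a 5 x 5 window, as in the text. The ratio between the peak value and the rest of the matrix could serve as the quality indicator mentioned above.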


COMPUTATION OF THE DISTORTION FUNCTION

The pairs of homologous points are then used to make a least-squares polynomial fit to the expression:

\[ X(x,y) = b(1)x + b(2)x^2 + b(3)y + b(4)y^2 + b(5)xy + b(6)xy^2 + b(7)x^2y + b(8)x^3 + b(9)y^3 + b(10) \]

\[ Y(x,y) = c(1)x + c(2)x^2 + c(3)y + c(4)y^2 + c(5)xy + c(6)xy^2 + c(7)x^2y + c(8)x^3 + c(9)y^3 + c(10) \]

or to comparable expressions for polynomials of degree 1 or 2.

In this expression, X and Y are the screen coordinates of the centre of the raw-image window associated with the landmark, x and y the coordinates of the centre of the landmark, and b, c the coefficients of the transformation.

For locally large distortions, an improved version of this modelling procedure would consist in computing the transforms of the reference landmarks before performing the numerical correlation with the sub-images of the raw image, so as to obtain better reliability in the peak search.
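
For the degree-1 case, the least-squares adjustment of the distortion function can be sketched as below. This is an illustrative sketch, not the original program: it solves the normal equations by Gaussian elimination, and the function names, landmark positions and coefficient values are assumptions. Only the terms b(1)x, b(3)y and b(10) are kept.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_degree1(points, targets):
    """Least-squares fit of T(x, y) = b1*x + b3*y + b10 over landmark pairs."""
    basis = [(x, y, 1.0) for x, y in points]
    ata = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(basis, targets)) for i in range(3)]
    return solve_linear(ata, atb)


# Hypothetical landmark centres (x, y) and their screen abscissae X.
pts = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
X = [1.02 * x - 0.05 * y + 3.0 for x, y in pts]
b = fit_degree1(pts, X)
```

The same call with the screen ordinates Y as targets gives the c coefficients. Higher degrees only add columns to `basis`, which is why the number of landmarks conditions the quality of the regression.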


RECTIFICATION OF THE IMAGE

During rectification, the transforms of the nodes of a regular grid of the reference image are computed, and the transformed point, or "parent point", is assigned a radiometry derived by interpolation from the radiometries of its neighbourhood in the raw image.

If P denotes the polynomial transformation whose coefficients b and c were computed in the modelling step, rectification comprises 3 phases:

- computation of the transform P(i,j) for each point (i,j) of the reference grid;

- determination of the neighbourhood in the raw image associated with the parent point, and of the set of interpolation coefficients corresponding to the position of this point within the sub-grid of an elementary cell;

- computation of the radiometry of the transformed point by bilinear interpolation in a 4 x 4 neighbourhood around the parent point.


Since rectification is very costly in computing time, the programming of these 3 phases has been optimized:

- For each line of an image block, the coordinates of the transformed points are computed iteratively from the coordinates of the first point. These coordinates are represented internally in floating point for transformations of degree ≥ 2, and as extended integers for linear transformations; the latter representation avoids recourse to the slow floating-point operators.

- An 8 x 8 sub-grid per elementary cell gives sufficient precision for the bilinear interpolations; 64 sets of coefficient matrices of size 4 x 4 must therefore be stored, and the set corresponding to the current point of a line accessed.

These coefficients are stored as integers and therefore include a normalization factor.

- The radiometry of the transformed point is computed by summing the contributions of the points of the neighbourhood, weighted by the associated coefficient, and then normalizing the result. The computations are done in extended-integer representation, that is, on 32 bits.

The radiometric data of the raw image accessed by this interpolation step are stored in a circular buffer updated after each line is processed. A table of pointers, prepared at initialization, gives access to the first point of a given line within the circular buffer.
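
The iterative coordinate computation for a linear transformation can be sketched with integer arithmetic as follows. This is an illustration only: the 16-bit fractional scaling `FRAC_BITS`, the function names and the coefficient values are assumptions, and Python integers are unbounded whereas the original computations were done on 32 bits.

```python
FRAC_BITS = 16  # assumed number of fractional bits in the fixed-point format


def to_fixed(value):
    """Convert a real coefficient to a scaled integer."""
    return int(round(value * (1 << FRAC_BITS)))


def transform_row(coeff_x, coeff_y, row, width):
    """Integer transformed coordinates of the points (0..width-1, row).

    The transformation is X = bx*x + by*y + b0, Y = cx*x + cy*y + c0.
    Only the first point of the row needs multiplications; every
    following point is obtained by one addition per coordinate.
    """
    bx, by, b0 = (to_fixed(c) for c in coeff_x)
    cx, cy, c0 = (to_fixed(c) for c in coeff_y)
    X = by * row + b0          # first point of the row (x = 0)
    Y = cy * row + c0
    out = []
    for _ in range(width):
        out.append((X >> FRAC_BITS, Y >> FRAC_BITS))
        X += bx                # step of one column
        Y += cx
    return out


# Hypothetical coefficients: X = x + 0.25*y + 3, Y = y - 2.
coords = transform_row((1.0, 0.25, 3.0), (0.0, 1.0, -2.0), row=4, width=3)
```

The bits discarded by the shift are the fractional position of the parent point; in a scheme like the one described, their top three bits in each direction would select one of the 64 stored coefficient sets.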


Radiometric interpolation

The bilinear interpolation coefficients are computed on the assumption that the energy of the signal is conserved when it is interpolated by a cubic function. The interpolation function is further assumed to be separable in x and y, so the derivation of the coefficients is presented in one dimension.

If f(x) is the interpolation function and F(x) its primitive, the assumption of conservation of signal energy over the 4 intervals leads to the system of equations:

\[ a_1 = F(-1) - F(-2) \]
\[ a_2 = F(0) - F(-1) \]
\[ a_3 = F(1) - F(0) \]
\[ a_4 = F(2) - F(1) \]

with

\[ f(x) = \alpha x^3 + \beta x^2 + \gamma x + \delta \]
\[ F(x) = \frac{\alpha}{4} x^4 + \frac{\beta}{3} x^3 + \frac{\gamma}{2} x^2 + \delta x + \varepsilon \]

Solving this system of linear equations gives:

\[ \alpha = \tfrac{1}{6}\,(-a_1 + 3a_2 - 3a_3 + a_4) \]
\[ \beta = \tfrac{1}{4}\,(a_1 - a_2 - a_3 + a_4) \]
\[ \gamma = \tfrac{1}{12}\,(a_1 - 15a_2 + 15a_3 - a_4) \]
\[ \delta = \tfrac{1}{12}\,(-a_1 + 7a_2 + 7a_3 - a_4) \]

The interpolation function resulting from this energy-conservation assumption is obtained from the coefficients associated with the radiometries of the neighbouring points as follows:

\[ f(x) = a_1\left(-\tfrac{1}{6}x^3 + \tfrac{1}{4}x^2 + \tfrac{1}{12}x - \tfrac{1}{12}\right) + a_2\left(\tfrac{1}{2}x^3 - \tfrac{1}{4}x^2 - \tfrac{5}{4}x + \tfrac{7}{12}\right) + a_3\left(-\tfrac{1}{2}x^3 - \tfrac{1}{4}x^2 + \tfrac{5}{4}x + \tfrac{7}{12}\right) + a_4\left(\tfrac{1}{6}x^3 + \tfrac{1}{4}x^2 - \tfrac{1}{12}x - \tfrac{1}{12}\right) \]

This function comprises 4 components which satisfy the criterion of continuity at the boundaries (i.e. $P_2(1/2) = P_4(1/2)$). It is computed for an 8 x 8 grid around the origin and stored in a file.
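
The derivation above can be checked numerically with exact rational arithmetic. The sketch below is only a verification aid; the function names and the sample radiometries are assumptions. It confirms that the cubic built from α, β, γ, δ integrates to the four radiometries over the four unit intervals, and that the four weight polynomials sum to one.

```python
from fractions import Fraction as Fr


def cubic_coeffs(a1, a2, a3, a4):
    """Coefficients of f(x) = alpha*x^3 + beta*x^2 + gamma*x + delta."""
    alpha = Fr(1, 6) * (-a1 + 3 * a2 - 3 * a3 + a4)
    beta = Fr(1, 4) * (a1 - a2 - a3 + a4)
    gamma = Fr(1, 12) * (a1 - 15 * a2 + 15 * a3 - a4)
    delta = Fr(1, 12) * (-a1 + 7 * a2 + 7 * a3 - a4)
    return alpha, beta, gamma, delta


def primitive(coeffs, x):
    """F(x) with the integration constant set to zero."""
    alpha, beta, gamma, delta = coeffs
    x = Fr(x)
    return alpha * x**4 / 4 + beta * x**3 / 3 + gamma * x**2 / 2 + delta * x


def weights(x):
    """The four polynomials multiplying a1..a4 in f(x)."""
    x = Fr(x)
    return (-x**3 / 6 + x**2 / 4 + x / 12 - Fr(1, 12),
            x**3 / 2 - x**2 / 4 - 5 * x / 4 + Fr(7, 12),
            -x**3 / 2 - x**2 / 4 + 5 * x / 4 + Fr(7, 12),
            x**3 / 6 + x**2 / 4 - x / 12 - Fr(1, 12))


# Hypothetical radiometries of four neighbouring pixels.
samples = (10, 20, 40, 30)
coeffs = cubic_coeffs(*samples)
integrals = [primitive(coeffs, hi) - primitive(coeffs, lo)
             for lo, hi in ((-2, -1), (-1, 0), (0, 1), (1, 2))]

# f evaluated at one sub-grid position, once per formulation.
x0 = Fr(3, 8)
f_from_weights = sum(w * a for w, a in zip(weights(x0), samples))
f_from_coeffs = (coeffs[0] * x0**3 + coeffs[1] * x0**2
                 + coeffs[2] * x0 + coeffs[3])
```

Because the weights sum to one at every x, storing them as scaled integers with a common normalization factor, as described above, does not bias the interpolated radiometry.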


Extension of the rectification algorithm to large distortions

For small distortions, the "raw image" data are stored line by line in the circular buffer: this buffer is updated after a line has been rectified, by releasing the lines no longer used and filling with new lines of higher index.

In this method, the size of the buffer determines the maximum angle of local rotation of a line. On the "Mitra 15", one is thus limited to 22 image lines, corresponding to a rotation of less than 3 degrees.

Extending the algorithm to large distortions consists in processing by blocks instead of sequentially along a line of the memory. The size of the blocks is here dictated by the options available on the "Trim" for accessing a memory zone by means of the zoom factor and the partition number. The program implemented uses blocks of 256 columns which can be accessed by giving their line number within a partition.

Processing therefore proceeds over 2 vertical strips of 256 columns in succession, the "raw image" data being stored in the circular buffer, which is updated dynamically to save space. The transforms of the vertical boundaries of the current strip are used to define, for a given line, the raw image values to be stored in the buffer, the start-of-line pointer being computed each time and kept in the table of line pointers.

This dynamic management of the pointer table has the advantage that the rectification program itself remains unchanged, the complex layout of the data affecting only the management of these pointers and the computation of the coordinates of the first point of a line before the iterative loop over the columns.

Partitioning the image into 2 vertical strips costs one additional read of the whole image in processing time. Local rotations can then reach values of the order of 2 x 22/256, i.e. about 12 degrees.


III - INTERACTIVE EXTRACTION OF GRAPHIC-TYPE INFORMATION

We describe here a method for the interactive detection of linear features on satellite images of the Earth's surface. These features generally represent roads, motorways, rivers, canals or railways, that is, communication routes, their shape and their function being closely linked. It is this communication function that gives such detection its strategic interest.

Observation of a satellite image (or of a map at the same scale) shows that the human (that is, artificial) part of the surface represented takes the form of a graph whose nodes are the towns and whose branches are the various communication routes between these towns; the large homogeneous masses of the image are natural: water surfaces, forests, agricultural areas. Towns are detected relatively well by multispectral methods, whereas following roads is much harder to achieve by these methods alone, for reasons of fineness or contrast; to avoid a dotted trace, the detection algorithm needs geometric and structural considerations that "help" it recover the trace sought.

Research on line detection has already been conducted in several directions. On binary traces (bubble chamber), graph theory is used to build the lines by growth. On images with several grey levels, several methods have been used: masking on a gradient image, determination of connected components having a very small area-to-perimeter ratio, or detection of segments followed by joining algorithms in preferred directions.

The method presented is a line follower or, more precisely, the following of a one-dimensional event propagating over the image domain. For a line, this event is modelled by the rectangular (boxcar) function, the cross-section of the line along a row or column of the image. The tracking is initialized by the operator, who designates the starting point of the line and its direction. Tracking is then carried out segment by segment.

At each step, the algorithm searches row by row, by correlation, for the point where the signal most closely resembles the event while best preserving the previous direction. Regression yields two new directions of the line, short-term and long-term. The difference between these directions modifies the propagation speed of the tracking and is an indication of the curvature of the line. This work is done in a working sub-zone oriented along one of the 4 principal directions of the image so that the line is almost perpendicular to the rows of the sub-zone; when the line deviates too much from this direction, a new orientation is taken.


Initialization


The operator designates on the screen the starting point of the line and a second point fixing the direction of the line and the initial advance of the algorithm.


Working sub-zone

The working sub-zone is a square of 64 x 64 pixels, oriented along one of the 4 principal directions of the image: vertical downward or upward, horizontal right or left. At each step a zone-overflow test is performed; in case of overflow a new sub-zone is extracted from the image.

The search for the line is carried out in the coordinate frame attached to this sub-zone; the changes of frame are made at the beginning and at the end of the search, at each step.


Search cone

The search cone is the triangle defined by:

-  the apex: the last point M of the trace

-  the height from M: the advance V

-  the direction of the median from M: the direction θ of the last segment traced

-  the length of the side opposite M: 6[?]/(V + 1)

The search cone is the domain in which the new point of the feature is sought. The opening angle of the
cone defines the maximum variation in direction of the feature; it decreases as the advance increases,
since a small advance is associated with strong curvature and hence with the possibility of a large
change of direction.


Correlation model

A model of the feature (for the trace so far) is computed by averaging the profiles of the feature over
those rows of the preceding trace whose signal-to-model correlation is sufficiently strong.

The model is computed by shifting each row so that the positions of the centre of the feature coincide
from row to row. The centre of the feature is corrected so as to coincide with an extremum of the signal.

The model is the function C defined by:

$$C(x) = \frac{1}{n} \sum_{k} s_k(p_k - \ell + x), \qquad x \in [1, 2\ell]$$

where the sum runs over the rows k, and

$s_k$ is the signal at row k

$\ell$ is the half-width of the model

$p_k$ is the centre of the feature at row k

n is the number of well-correlated rows at the preceding step.

To reduce the influence of noise and increase the inertia of the model, the correlation model finally
used is:

C = (C + Cp)/2, where Cp is the model from the preceding step.
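
The model computation can be sketched as follows. This is a minimal illustration only, assuming each row is held as a Python list, that the centres p_k are already known, and that every window p_k - l + 1 ... p_k + l lies inside its row. The names `build_model` and `damp_model` are hypothetical and do not come from the original program.

```python
def build_model(rows, centres, half_width):
    """Average the row profiles after aligning them on their centres p_k."""
    n = len(rows)
    model = []
    for x in range(1, 2 * half_width + 1):          # x runs over [1, 2l]
        total = sum(s[p - half_width + x] for s, p in zip(rows, centres))
        model.append(total / n)
    return model


def damp_model(model, previous):
    """Inertia step C <- (C + Cp) / 2, with Cp the model of the preceding step."""
    return [(c + cp) / 2 for c, cp in zip(model, previous)]
```

For two rows whose feature centres are one pixel apart, `build_model([[0, 0, 4, 8, 0, 0], [0, 0, 0, 6, 10, 0]], [2, 3], 1)` averages the aligned samples (4 with 6, 8 with 10).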


Correlation

The correlation is carried out row by row. On the line segment lying within the search cone, the point
is sought at which the model of the feature and the signal are closest (the model being centred on that
point). Let p be the abscissa of the point tested:

.  Left correlation:

$$\mathrm{corg}(p) = \sum_{x=1}^{\ell} a(x)\,\lvert s(p - \ell + x) - c(x) \rvert$$

.  Right correlation:

$$\mathrm{cord}(p) = \sum_{x=\ell+1}^{2\ell} a(x)\,\lvert s(p - \ell + x) - c(x) \rvert$$

where $\ell$ is the half-width of the model,

c is the model,

a is a weight function that emphasises the contribution of the centre of the feature.

The correlation at the point p is then:

$$\mathrm{cor}(p) = \frac{\max(\mathrm{corg}(p), \mathrm{cord}(p)) + 2 \min(\mathrm{corg}(p), \mathrm{cord}(p))}{3}$$

The point of the row having the best correlation, i.e. the minimum of cor(p), is then sought:

$$\mathrm{Mac}_1 = \min_{p \in [c_1, c_2]} \mathrm{cor}(p), \qquad \mathrm{Ipic}_1 = p \text{ such that } \mathrm{cor}(p) = \mathrm{Mac}_1$$

$$\mathrm{Mac}_2 = \min_{p \in [c_1, c_2] \setminus \{\mathrm{Ipic}_1\}} \mathrm{cor}(p), \qquad \mathrm{Ipic}_2 = p \text{ such that } \mathrm{cor}(p) = \mathrm{Mac}_2$$

The best correlation for the row is then:

Mac = Mac1 at the point Ipic = Ipic1, unless Mac2 is very close to Mac1 and Ipic2 is closer than Ipic1
to the peak of the preceding row, in which case Mac = Mac2 and Ipic = Ipic2.

This is done for the V rows of the search cone, giving a table:

Ix(k) = Ipic for row k if Mac < Scor

Ix(k) = ∅ for row k if Mac > Scor

where Scor is a variable threshold computed from the percentage of good correlation requested by the
operator, and a function of the width of the feature and of the contrast.

Finally the rate of good correlation is computed:

$$T_x = \frac{1}{V}\,\mathrm{card}\{\text{rows } k \text{ such that } I_x(k) \neq \emptyset\}$$
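
The row-by-row search can be sketched as follows. This is a hedged illustration, not the original program: the combination (max + 2 min)/3 is an assumed reading of the printed formula, the function names are hypothetical, and the tie-break between Mac1 and Mac2 using the peak of the preceding row is omitted for brevity. A lower score means a better match.

```python
def cor(signal, model, weights, p):
    """Mismatch between the model centred at p and the signal (lower is better)."""
    half = len(model) // 2                       # half-width l of the model

    def term(x):                                 # x runs from 1 to 2l as in the text
        return weights[x - 1] * abs(signal[p - half + x] - model[x - 1])

    left = sum(term(x) for x in range(1, half + 1))
    right = sum(term(x) for x in range(half + 1, 2 * half + 1))
    # Assumed reading of the garbled source formula: (max + 2 * min) / 3.
    return (max(left, right) + 2 * min(left, right)) / 3


def best_point(signal, model, weights, c1, c2, scor):
    """Return (Ipic, Mac) for one row; Ipic is None when Mac is not below Scor."""
    ipic = min(range(c1, c2 + 1), key=lambda p: cor(signal, model, weights, p))
    mac = cor(signal, model, weights, ipic)
    return (ipic if mac < scor else None), mac


def good_correlation_rate(ix):
    """Tx: fraction of the V rows for which a point was accepted."""
    return sum(1 for v in ix if v is not None) / len(ix)
```

The caller is assumed to choose c1 and c2 so that every window stays inside the row.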


Computing the direction of the feature

The regression lines of the Ix(k) are sought, namely:

D2, the regression line of the Ix(k) for k = 1, ..., V/2
D1, the regression line of the Ix(k) for k = 1, ..., V

The line sought is the one passing through M whose direction minimises the sum of the squared distances
from the Ix(k) to the line.
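
One way to obtain such a line is sketched here, under stated assumptions: the distances are taken to be perpendicular distances, points are (row, column) pairs with M as origin of the offsets, and rows with no accepted point are stored as `None`. Under those assumptions the optimal angle has the closed form 0.5 * atan2(2 Sxy, Sxx - Syy), where the sums are second moments about M. The function names are hypothetical.

```python
import math


def direction_through(m, points):
    """Angle (radians) of the line through m minimising squared perpendicular distances."""
    sxx = syy = sxy = 0.0
    for x, y in points:
        dx, dy = x - m[0], y - m[1]              # offsets measured from M
        sxx += dx * dx
        syy += dy * dy
        sxy += dx * dy
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def two_directions(m, ix):
    """Long-term (rows 1..V) and short-term (rows 1..V/2) directions."""
    v = len(ix)
    pts = [(m[0] + k, ix[k - 1]) for k in range(1, v + 1) if ix[k - 1] is not None]
    short = [(x, y) for x, y in pts if x - m[0] <= v // 2]
    return direction_through(m, pts), direction_through(m, short)
```

The angle is measured from the direction of advance (the row axis of the sub-zone), so a feature running straight ahead gives 0.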


This gives 2 search directions, θ1 (long-term) and θ2 (short-term), from which is computed Id, a
parameter indicating directional stability, where Sj is a given threshold.


Id measures the distance from M over which D1 and D2 are not separated by more than Sj.


Advance

The advance V determines the new search cone, that is, the possibility of accelerating the trace in
the following step.

The new advance V' is:

V' = Tx (Id + V/2)

To avoid excessive accelerations this advance is weighted by an inertia coefficient Ks:

Ks = 12/(V' + 1) and V' = V' + Ks

The segment traced then has as its origin the last point traced and as its end the point:

(k, Ix(k)), where k is the row such that

k = max{k such that Ix(k) ≠ ∅ and k ≤ V/2}


Change of direction

Too large a value of the parameters θ1 and θ2 triggers a change of direction; for this the following
value is tested:

T = |θ2| + 1.25 |θ1|

T < 2.5 → no change of direction
T > 2.5 → change of direction

Let M be the point found, MP the preceding point, MPP the point preceding MP, and θp the preceding
direction angle:

If θp < 1, set M1 = MP, M2 = M and restart the initialisation of a feature with a change of direction
of the working sub-zone.

If θp > 1, set M1 = MPP, M2 = MP; the reinitialisation takes place with a step back.
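
The direction-change test and the choice of restart points can be sketched as below. This is an illustration under assumptions: the source does not state the angular unit or what happens when θp equals 1 exactly (here that case takes the step-back branch), and the function names are hypothetical.

```python
def needs_direction_change(theta1, theta2, limit=2.5):
    """Evaluate T = |theta2| + 1.25 |theta1| against the limit quoted in the text."""
    return abs(theta2) + 1.25 * abs(theta1) > limit


def restart_points(m, mp, mpp, theta_p):
    """Choose the pair (M1, M2) used to re-initialise the trace."""
    if theta_p < 1:
        return mp, m            # restart from the last segment
    return mpp, mp              # assumed to cover theta_p >= 1: step back one segment
```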


Stopping the trace

The trace stops in the following cases:

(i)  -  The direction θ2 lies outside the search cone.

(ii)  -  The advance is too small (in general because θ1 ≠ θ2).

(iii)  -  The algorithm finds no valid point in the half search cone.

(iv)  -  The feature leaves the limits of the image.

(v)  -  The new model of the feature has too low a contrast.

(vi)  -  The rate of good correlation Tx is too low.


In cases (iii) and (vi) a further attempt to find the feature is provided for: if |θ1| and |θ2| are
large enough, a trial trace is made in the neighbouring direction, using the strategy adopted for the
change of direction (with a step back).

At the end of the trace, control is returned to the operator, who decides how the program continues.

This method allows features of any thickness to be traced and adapts to changes of contrast along the
feature; this is its main advantage, together with its speed, which results from the one-dimensional
nature of the processing. The feature is traced at variable speed, fast when the trace is straight and
slower when its direction varies. The problems are linked to the handling of direction: they are due
either to the tracing speed not being adapted to the curvature of the feature, or to a feature direction
lying midway between two orientations of the working sub-zone (which has only 4 possible directions).
An improvement would therefore be obtained by considering 8 orientations for the working sub-zone
(k x π/4 instead of k x π/2), but this would make the program considerably heavier. It is certain,
however, that since the computer lacks the synthesising capacity of the human eye, which "constructs"
the feature from a global view of the image, any method of detecting line features runs up against the
problem of a purely local, always partial, detection.


RESEARCH INTO METHODS OF IMAGE PROCESSING FOR TARGET
ENHANCEMENT AND DETECTION


by 

D.B. Duke, A.J. Fryer & P.A. Bird
British  Aerospace  Dynamics  Group 
Infra-Red  Equipment  Division 
Hatfield,  Hertfordshire,  England. 


1.  INTRODUCTION 

This presentation is aimed at showing some of the image processing techniques developed and used by the
Infra-Red Equipment Division of British Aerospace at Hatfield. Work in image processing has been going on
in the division since the early 1970s and concentrated initially on techniques for optimising the
performance of lightweight thermal imagers for battlefield surveillance. Recent work, however, has been
aimed at developing methods of enhancing targets with respect to their backgrounds for ease of detection
and recognition. This has ultimately led on to our current work on target acquisition and tracking. The
paper is divided into two sections:

(i)  Methods  of  Sensitivity  Enhancement 

(ii)  Methods  of  Spatial  Resolution  Enhancement. 

The use of these methods for improving image interpretability and automatic target detection is briefly discussed.


2.  SENSITIVITY  ENHANCEMENT 

The  Company  has  been  interested  mainly  in  thermal  infra-red  systems  where  the  noise  has  two  main  components, 
an  added  white  Gaussian  component  due  to  background  photons  and  a  component  inversely  proportional  to  some 
function  of  frequency. 

The  latter  noise  component,  in  systems  with  visual  displays,  is  particularly  objectionable  because  of  the  high 
frequency  effect  it  causes  in  the  direction  normal  to  the  raster  lines  due  to  imbalance  between  lines.  It  is  reduced  by 
'D.C.  restoration’  where  the  detectors  are  periodically  shown  a  reference  radiation  source  and  the  input  to  the  display 
simultaneously  biased  to  a  defined  level  and  the  bias  clamped.  Where  an  effective  approach  of  this  sort  is  possible  the 
main  residual  inter-line  imbalance  may  be  due  to  the  white  noise  component  at  the  time  the  bias  is  determined  and  this 
can  be  minimised  by  one  of  the  following  methods. 

The  low  frequency  noise  component  can  be  reduced  further  by  the  ‘overscan’  method  discussed  in  the  next  section, 
where  each  line  is  effectively  sampled  a  number  of  times. 

Another  approach  to  reduce  both  forms  of  noise  is  to  replace  each  pixel  brightness  by  some  weighted  average  of 
those  of  surrounding  pixels.  However,  this  blurs  the  image.  It  raises  the  question  of  finding  the  optimum  compromise 
between  resolution  and  noise  for  a  given  task. 

For  the  visual  recognition  task  the  Company  has  found  that,  over  a  wide  representative  range  of  input  signal  to 
noise  ratio,  deblurring  at  the  expense  of  increased  noise  does  no  harm  and  often  improves  performance.  This  is  illustrated 
in  the  section  under  'Aperture  Correction’. 

An alternative method is to replace each pixel by the median rather than the mean of its neighbourhood.
This does not blur edges or ramps, provided there is only one edge in the neighbourhood at any instant,
but smooths noise rather as the mean does.

Because this median filter is non-linear various forms are possible. For example, several passes with
the same filter leave unchanged those parts which a single pass does not change, and progressively
smooth the rest. A cross-shaped window can be better than a square.
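
A minimal one-dimensional sketch of a median filter follows, to illustrate the edge-preserving behaviour described above. It is not the Company's implementation: the window is simply truncated at the ends of the row (an assumption), and the function name is hypothetical.

```python
from statistics import median


def median_filter(row, radius=1):
    """1-D median filter; the window is truncated at the ends of the row."""
    n = len(row)
    return [median(row[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]
```

Applied to `[0, 0, 0, 9, 0, 0, 5, 5, 5, 5]`, the isolated value 9 is removed while the step from 0 to 5 stays where it was, and a second pass leaves the result unchanged.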

The median filter suppresses impulses very effectively and so, because it leaves ramps and edges
unchanged, it can be used very effectively to enhance compact sources against, for example, cloud
background by subtracting the filtered image from the original.

There is a variety of non-linear noise cheating algorithms which use logic to decide whether a given
pixel brightness is likely to be due to noise and if so replace it by a better fit.

A method which evaluation to date suggests is nearly as effective as a matched filter, without the
matched filter's need to know the target form (which is usually not known in practice), is to use a
series of filters, localised in both space and frequency, but of different scales. A non-linear method
analogous to the median filter can be used to remove the important noise from these localised filter
outputs without materially affecting the important signal content. It is then possible to sum the
filtered outputs and so reconstruct the scene with noise greatly reduced but with resolution minimally
impaired.

The  different  scales  of  the  filters  give  additional  spin  off  in  approaches  to  range  invariance  and  the  ability  to  allow 
the  output  of  the  channels  at  one  scale  to  guide  the  processing  of  the  outputs  of  other  scale  channels. 

In  automatic  recognition  and  navigation  systems,  outline  may  be  more  invariant  than  contrast  and  so  may  form  a 
simpler  recognition  criterion. 

A  first  move  in  defining  outline  may  be  preprocessing  to  reduce  noise  as  discussed  above.  This  may  be  followed  as 
a  second  move  by  determining  the  best  elemental  straight  line  edge  in  the  neighbourhood  of  each  pixel,  by  correlation 
with  masks  at  different  orientations  and  choice  of  that  orientation  giving  the  greatest  output.  This  effectively  integrates 
along  the  edge  and  further  improves  signal  to  noise  ratio,  provided  the  mask  is  not  too  long.  If  it  is  too  long,  edge 
curvature  reduces  signal  to  noise  ratio,  so  that  there  is  skill  in  the  choice  of  mask. 

The  third  move  is  likely  to  be  to  link  these  elemental  straight  edges  to  form  curves  using  logic  to  decide  which  are 
mutually  consistent  and  so  valid  and  which  are  inconsistent  and  so  due  to  noise.  The  process  further  enhances  signal  to 
noise. 

Because  there  are  three  stages  in  the  above  process,  it  is  necessary  in  assessing  an  approach,  to  include  all  3  stages. 
For  instance  some  preprocessors  make  elemental  edge  detection  less  effective  and  some  elemental  edge  detectors  do  not 
provide  good  orientation  information,  helpful  to  the  linking  stage. 

Usually  the  Signal  Transfer  Function  of  imagers  is  effectively  linear,  and  for  various  reasons  this  may  not  be 
optimum. 

One is that the eye's response is not linear but more sensitive near black. Consequently it may be better to reduce
image  gain  near  black  and  increase  it  near  white  to  pack  more  information  into  the  region  the  eye  can  usefully  use. 

Sometimes  the  objects  in  a  scene  can  be  divided  into  two  or  more  classes,  perhaps  natural  components  near  ambient 
and  internally  heated  components  such  as  houses  and  people  with  very  little  scene  content  between  the  classes.  An 
approach  optimising  this  situation  is  histogram  equalisation  where  the  Signal  Transfer  Function  is  made  to  follow  the 
form  of  the  cumulative  histogram  of  scene  brightness.  This  allows  the  most  used  brightness  region  in  the  scene  to  use  the 
greatest  proportion  of  the  grey  scale. 
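
Histogram equalisation of this kind can be sketched as follows, assuming integer pixel values in the range 0 to `levels - 1` held in a flat list; the function name and the rounding rule are illustrative choices, not taken from the paper.

```python
def equalise(pixels, levels=256):
    """Map each grey level through the scaled cumulative histogram of the scene."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    cumulative = 0
    lut = []                                     # the Signal Transfer Function as a table
    for count in hist:
        cumulative += count
        lut.append(round((levels - 1) * cumulative / total))
    return [lut[v] for v in pixels]
```

With 8 grey levels, the crowded input levels 0, 1 and 2 in `[0, 1, 1, 1, 1, 2, 2, 7]` are spread out to 1, 4 and 6, so the most used brightness region occupies most of the grey scale.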

Histogram  equalisation  can  however,  make  an  image  more  difficult  to  use.  For  example  a  histogram  equalised  chess 
board  loses  contrast  between  squares  but  amplifies  noise  within  squares. 

In some cases it is possible to improve performance by splitting the image into smaller sections, each of
which is homogeneous in its basic characteristics, and histogram equalising them individually. An
important aim is often to enhance interesting objects and suppress the rest to minimise the observer's
confusion. Local histogram equalisation however may enhance everything.

An analogous approach is to make the local area 'Contrast' and 'Brightness' a function of, say, local area
standard deviation and mean level. Taken to the extreme this can turn the image into uniform noise, and
the choice of optimum control signal and control function is important.

Information  within  the  tank  has  been  pulled  out  but  so  has  a  lot  of  the  background  information.  As  the  standard 
deviation  increases  the  image  begins  to  look  more  and  more  noisy  and  some  optimisation  as  to  the  choice  of  these 
parameters is needed.

The use of pseudocolour as well as brightness variation has the potential to vastly increase the
information content of a display. It is important to connect colouration to human experience or the
result just confuses. A scale running through the spectrum from blue = cold to red = hot can have this
property. Not only does the red suggest heat but the sky can be blue and the ground green to yellow.
Implementation of such a system may require the imager to sense absolute rather than relative
temperatures.

In such systems it is necessary to arrange brightness as well as hue changes at edges for perception to
use them effectively and there are various schemes to do this.

2.1  Overscanning

In most scanning systems it is usual for the raster pitch to equal the size of the detector. This results
in a degradation of resolution in the direction normal to the raster and has serious effects on the range
capability of the system.

Overscanning, in which the pitch of the raster is some fraction of the detector width, overcomes this
problem and produces an isotropic image whose resolution is determined solely by the system transfer
function. The idea of overscanning follows directly from the Nyquist sampling theorem. Normally a series
of discrete samples is taken along the direction of the raster in accordance with this theorem but not in
the direction normal to it. At least 2 samples per detector width are required in this direction.
Overscanning will also reduce aliasing effects and the introduction of steps in otherwise straight edges.
Results showing the effects of overscanning are shown in Figure 1 for 3 different ranges. The slide is a
rather poor reproduction but it can be seen that overscanning produces a marked improvement in image
quality. Imagery at a range of 1.5 km with overscanning is similar in interpretability to that at 0.75 km
without overscanning, thus doubling the range capability.

Overscanning is also desirable where it is necessary to register successive frames of data in order to
track objects. It is possible that a point target which had previously completely covered 1 detector
would, in the next frame, lie between 2 detectors, and these pixels would therefore each have only half
the energy. Comparison of pixels when registering would therefore become difficult. With overscanning at
least one pixel would have the full information.
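
The registration argument can be illustrated with a simplified model, assuming the target image and the detector are both uniform strips exactly one detector wide, so that the energy collected is their overlap. Positions are in units of detector width, and the function name is hypothetical.

```python
def detector_outputs(target_pos, pitch, n_samples):
    """Fraction of the target energy seen at each sample position.

    The overlap of two unit-width strips whose centres are d apart is
    1 - |d|, clipped at zero (simplifying assumption of this sketch).
    """
    return [max(0.0, 1.0 - abs(target_pos - i * pitch)) for i in range(n_samples)]
```

With a raster pitch of one detector width, a target centred at 0.5 gives two half-energy samples; with a pitch of half a detector width (double overscan) one sample sees the full energy.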

2.2  Aperture  Correction 

A  non-recursive  spatial  filter  for  the  removal  of  system  blur  was  produced.  This  consisted  of  a  3  x  3  weighted 
matrix  designed  to  collect  energy  falling  on  adjacent  detectors,  due  to  the  spread  of  the  system  optics,  and  to  add  this 
back  into  the  central  detector. 

Figure 2 shows the way in which this was done. 3 adjacent detectors are labelled A, B and C in this diagram.

The freq. response of the centre element, B, is shown by the full curve and that of the function
$B - \frac{A + C}{2}$ is shown by the broken curve.

Weighting this function by some factor k and adding to B results in an MTF with a better high freq. response.

Trial and error have shown that a value of k of 0.3 is optimum.
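
A one-dimensional sketch of this correction is given below. It assumes the high-pass function is B - (A + C)/2, which is a reading of the damaged formula in the source, and it works along a single row rather than with the full 3 x 3 matrix; end samples are left unchanged. The function name is hypothetical.

```python
def aperture_correct(row, k=0.3):
    """Add k times the assumed high-pass term B - (A + C)/2 to each interior sample B."""
    out = [float(v) for v in row]
    for i in range(1, len(row) - 1):
        a, b, c = row[i - 1], row[i], row[i + 1]
        out[i] = b + k * (b - (a + c) / 2)
    return out
```

A uniform row is left unchanged, while the edges of a bar such as `[0, 0, 10, 10, 0, 0]` are steepened, with a small undershoot on the dark side.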

Figure  3  shows  the  results  of  this  'aperture  correction’.  A  sharpening  effect  is  noticeable  making  interpretation 
easier. 

Previous  workers  in  aperture  correction  do  not  appear  to  have  shown  any  useful  improvement  in  recognition  and 
it  is  believed  that  this  was  because  raster  overscan  had  not  been  employed. 

Figure  4  shows  the  effects  of  aperture  correction  on  noisy  imagery.  Improvement  is  still  noticeable. 


3.  ENHANCEMENT  OF  SPATIAL  RESOLUTION 

Usually  the  Modulation  Transfer  function  of  the  system  can  be  measured.  If  so  it  is  possible  to  apply  an  inverse 
filter  to  correct  for  it.  Obviously  at  frequencies  where  the  transfer  function  is  so  poor  that  all  signal  is  lost  in  noise 
the  pure  inverse  filter  does  more  harm  than  good  and  some  alternative  strategy  is  needed  for  these  regions. 

A  particular  case  which  the  Company  has  studied  is  that  where  the  Modulation  Transfer  function  is  defined  largely 
by  the  finite  size  of  the  detector  element. 

This  may  be  true,  for  example,  of  an  optimised  staring  matrix  array  detector.  In  the  early  development  stage  of 
such  detectors  it  may  be  difficult  to  produce  a  good  yield  of  large  arrays.  Perhaps  the  diameter  of  the  array  may  be 
limited to a few tens of elements. In this case there may be conflict between field of view needed and resolution needed.
The  best  compromise  may  be  to  meet  the  field  of  view  requirement  and  compensate  detector  blurring  by  a  modified 
inverse  filter. 


The Modulation Transfer function of a square detector is a two dimensional sinc function,
$\frac{\sin x}{x}$, an oscillation about zero dying away slowly.


Simulation has shown that, in this case, and making a reasonable assumption about input signal to noise
ratio, resolution can be more than doubled.


Figure  5  shows  a  typical  MTF  for  a  6X  overscanned  image  and  Figure  6  shows  the  corresponding  inverse  filter. 


The  bottom  image  in  Figure  7  shows  the  6X  overscanned  image  used  for  the  filtering  work  and  Figure  8  shows  the 
results  of  inverse  filtering. 

Noise at 43 cycles per picture width occurs on all these images due to a slight mismatch between the zeros of the
MTF  and  the  inverse  filter. 

(a)  is  the  result  of  applying  the  inverse  filter  in  Figure  6.  The  next  2  images  (b)  and  (c)  have  the  maximum  gain 
of  the  filter  limited  to  ±20X  and  this  has  had  little  effect  on  the  image. 

In  order  to  see  how  much  information  is  contained  in  the  higher  frequencies  successive  lobes  of  the  inverse  filter 
were  removed. 

(d)  shows  the  effect  of  removing  the  information  beyond  the  3rd  zero  crossing 

(e)  information  beyond  the  2nd  zero  is  removed  and 

(f)  information  beyond  the  1st  zero  is  removed. 

It  can  be  seen  that  useful  information  is  available  beyond  the  1st  zero  of  the  MTF  and  it  is  important  to  retain 
this  information  -  beyond  the  2nd  zero  however  little  is  gained  by  its  retention. 

Image  (h)  shows  the  importance  of  retaining  phase  information.  Here  the  phase  of  the  2nd  lobe  has  been  made 
positive. 

Image  (g)  just  shows  the  effects  of  tapering  the  2nd  lobe  of  the  inverse  filter  to  zero  at  40  c.p.w.  to  cut  out  the 
43  c.p.w.  noise. 

A sharp drop to zero in the filter function where the MTF falls to zero produces ringing in the image.
The Wiener filter tries to overcome this by automatically reducing the gain in those regions where the
signal falls towards the noise level. This filter produces poorer resolution than the inverse filter but
also reduces ringing. Some kind of compromise between the two is desirable. This can be achieved by using
weighted combinations of Wiener and inverse filter. When α is equal to a half in equation (3.2) we have
what is known as the geometric mean filter.


$$\text{WIENER FILTER} = \frac{1}{H(f) + \dfrac{[Y_n(f)]^2}{[Y_s(f)]^2 \, H(f)}}$$


where $H(f) = \mathrm{sinc}(\pi f / f_c)$, $f_c$ = freq. of 1st zero of system MTF

$[Y_n(f)]^2$ = image noise power spectrum

$[Y_s(f)]^2$ = image signal power spectrum

$$(\text{Inverse})^{\alpha} \cdot (\text{Wiener})^{1-\alpha} \qquad (3.2)$$
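
The three gains can be compared numerically with the sketch below. It assumes a real-valued H(f) = sin(x)/x with x = πf/fc, evaluates each gain at a single frequency where H is non-zero, and carries the sign of H through the fractional powers so that the phase of each lobe is retained; that sign handling and the function names are assumptions of this illustration.

```python
import math


def mtf(f, fc):
    """H(f) = sin(x)/x with x = pi * f / fc, so the first zero falls at f = fc."""
    if f == 0:
        return 1.0
    x = math.pi * f / fc
    return math.sin(x) / x


def inverse_gain(h):
    return 1.0 / h


def wiener_gain(h, noise_power, signal_power):
    """1 / (H + Yn^2 / (Ys^2 * H)), as in the Wiener expression above."""
    return 1.0 / (h + noise_power / (signal_power * h))


def combined_gain(h, noise_power, signal_power, alpha):
    """(Inverse)^alpha * (Wiener)^(1 - alpha), given the sign of H (assumption)."""
    inv = abs(inverse_gain(h))
    wie = abs(wiener_gain(h, noise_power, signal_power))
    return math.copysign(inv ** alpha * wie ** (1.0 - alpha), h)
```

For H = 0.5 and a noise-to-signal power ratio of 1/4, the inverse gain is 2, the Wiener gain is 1, and the geometric mean (α = 1/2) lies between them at about 1.41.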

Another  approach  which  we  are  presently  looking  at  is  not  to  set  the  gain  to  zero  in  the  regions  of  the  MTF  zeros 
but  to  insert  whatever  freq.  and  phase  component  gives  the  smoothest  image  or  one  having  the  noisiness  expected  at  these 
freqs.  from  the  known  system  noise  characteristics. 

When  narrow  bands  of  the  frequency  spectrum  are  set  to  zero  it  produces  a  ringing  effect  visible  to  the  eye  as  such. 
Thus  it  is  possible  to  determine  the  phase  and  amplitude  necessary  in  the  blanked  region  to  suppress  the  ringing  and  this 
may  be  a  good  guess  at  the  missing  component. 

One  way  to  implement  this  is  to  choose  the  maximum  entropy  waveform  and  the  classical  approaches  to  this  also 
ensure  positive  brightness  solutions. 

The  prior  knowledge  that  brightness  must  be  positive  produces  a  large  improvement  in  resolution,  where  it  can 
be  applied,  over  linear  restoration  systems  which  if  one  attempts  to  push  them  too  far  produce  overshoots  instead  of  a 
narrower  central  peak  to  the  corrected  system  spread  function. 

In  general  it  does  not  apply  to  a.c.  coupled  thermal  imagers.  Nor  does  it  help  improvement  of  resolution  of  objects 
against  positive  rather  than  zero  background. 

However,  in  the  final  phase  of  command  guidance,  for  example,  and  it  is  only  in  the  final  phase  that  enhanced 
resolution  may  be  needed,  missile  and  target  are  close  together  against  what  may  be  effectively  uniform  background 
and  the  system  can  adjust  its  d.c.  restoration  to  make  the  background  an  artificial  zero  level.  In  these  circumstances  a 
resolution four times better than the diffraction limit has been claimed (Ref. 1).

In developing methods of target enhancement with respect to background a study was carried out on the
spatial frequency characteristics of typical targets and backgrounds.

A characteristic mottling effect was noticeable in the spatial spectra of vehicles which was absent in
those of backgrounds. This is illustrated in the following Figures which show the modulus of the Fourier
transform and the logarithm of the modulus of the Fourier transform of typical vehicles and backgrounds.

Figure  9.  This  is  the  transform  of  a  thermal  image  of  a  vehicle.  Notice  the  mottling  effect. 

Figure 10 shows the transform of another vehicle. Again notice the patterns in the central lobe.

Figure 11 shows the transform of a typical background scene, plains. No characteristic mottling is
present; the transform appears completely random.

Attempts to isolate these periodic mottlings were made by Fourier transforming these transforms again.
These are shown in Figure 12. The images on the left hand side show the modulus of the Fourier transforms
of the objects and those on the right the Fourier transform of the central part of those on the left. The
top two pairs are of vehicles and show marked similarities whereas the bottom pair are of typical
background and are quite different. (The size of the lower right hand image was due to a horizontal scale
change in the output program when this image was produced. All the other images were produced at an
earlier stage).

So  far  no  positive  conclusions  have  been  made  from  this  study. 


4.  REFERENCES 

1.  Frieden, B.R., Picture Processing and Digital Filtering, Chapter 5, edited by T.S. Huang, Springer-Verlag.


[Figure 1: panels NO OVERSCAN, DOUBLE OVERSCAN, TREBLE OVERSCAN]

Fig. 3  Effect of aperture correction.

[Transform figure: axis CYCLES PER PICTURE WIDTH; panels (c) Modulus of transform, (d) Logarithm of transform]


IMAGE  ENHANCEMENT  IN  REAL  TIME 
by 

H.  Yndestad 

Norwegian  Defence  Research  Establishment 
Po  Box  25 
2007  Kjeller 
Norway 


SUMMARY 

This paper presents an algorithm and a planned computer architecture for image enhancement
in real time. The algorithm for image enhancement will reduce dynamic range and enhance
contrast in an image by homomorphic filtering.

The filter function is separable in the column and line directions, and the filter is
implemented as a linear phase frequency sampling filter. Two-dimensional filtering can
then be achieved by filtering a picture first line by line and then column by column in
a recursive filter. Dynamic range reduction and contrast enhancement are controlled by two
parameters. The algorithm needs 32 arithmetic operations.

The  image  processor  is  a  set  of  processing  elements,  each  having  a  specific  task.  One 
such  task  can  be  the  described  algorithm  for  homomorphic  filtering.  The  processing  element 
is  built  up  by  a  micro-processor  for  local  control,  a  bit-slice  processor  for  data  transfer 
control  and  a  specialized  arithmetic  unit  for  signal  processing.  Data  transfer  between 
processing  elements  is  carried  out  in  a  high  speed  ring  network. 

1  INTRODUCTION 

The illumination on a scene often varies strongly. The dynamic range of imaging sensors
can be about 50 dB or more. A display device such as a cathode-ray tube typically has a
dynamic range of less than 30 dB. When a video signal from a wide dynamic range sensor is to
be displayed on a display device, some form of processing is therefore required to minimize
the loss of information.

Spatial homomorphic filtering is a method that has good dynamic range reduction and contrast
enhancement properties with a minimum loss of information [1]. Because of the good
properties of this method, it has been of interest to design an image processor that can
do the necessary processing in real time.

Spatial filtering in real time requires an image processor with a very high processing
capacity. It is therefore important to have a filter algorithm which is easy to implement,
and the image processor must have an architecture that gives flexibility and a high
processing capacity.

2  DEFINITIONS 

An  illumination  component  is  a  digital  representation  of  the  illumination  on  a  scene. 

A  reflectance  component  is  a  digital  representation  of  the  reflectance  at  the  scene. 

A picture is a two-dimensional array of numbers that is a sampled and digitized
representation of the luminance from a scene. A picture element is an element of this array.

3  HOMOMORPHIC  FILTERING 

The picture elements can be written [1] as

$$x(n,m) = x_i(n,m) \cdot x_r(n,m) \tag{1}$$

where  n  and  m  are  the  discrete  spatial  variables. 

The illumination component $x_i(n,m)$ is always positive and has a wide dynamic range when
there are shadows in the scene. In most cases this component has most of its energy at the
lowest spatial frequencies in the picture [1].

The reflectance component $x_r(n,m)$ is always above zero and below one. This component
modulates the contrast in the picture and has more of its energy at higher spatial
frequencies than the illumination component. This makes it possible to reduce the dynamic
range and at the same time enhance the contrast by some kind of high-pass filtering.

There are some difficulties with direct linear filtering of the picture. Linear
filtering will introduce both negative and positive element values. A second problem is
that the illumination component $x_i(n,m)$ and the reflectance component $x_r(n,m)$ are
multiplied. It can be shown [2] that it is easier to filter them independently when they
are added.

Both  of  these  problems  can  be  solved  by  taking  a  logarithmic  transform  of  the  picture 
element  values  before  the  linear  filtering.  This  gives  a  new  set  of  components  which  are 
additive.

$$g(n,m) = \ln[x(n,m)] = \ln[x_i(n,m)] + \ln[x_r(n,m)] \tag{2}$$

If a linear filter modifies the illumination component by a factor B and the reflectance
component by a factor A, then the filtered picture elements in additive form will be

$$p(n,m) = B \cdot \ln[x_i(n,m)] + A \cdot \ln[x_r(n,m)] \tag{3}$$

The modified components are then transformed back to multiplicative form by an
antilogarithmic transform

$$y(n,m) = \exp[p(n,m)] = [x_i(n,m)]^B \cdot [x_r(n,m)]^A \tag{4}$$

The  dynamic  range  is  reduced  when  B  <  1,  and  contrast  is  increased  when  A  >  1.  This 
type  of  filtering  is  called  homomorphic  filtering. 

3.1  The  linear  filter 

A simple way to make the linear filter is to subtract spatially low-pass filtered
elements from the unfiltered elements as illustrated in figure 1. This can be expressed as

$$p(n,m) = A \cdot (\ln[x_i(n,m)] + \ln[x_r(n,m)]) - (A-B) \cdot (\ln[x_i(n,m)] + \ln[x_r(n,m)]) * h(n,m) \tag{5}$$

where A and B are constants, $*$ is the convolution operator and h(n,m) is the impulse
response of the low-pass filter.

If there is no overlap between the spatial frequencies of the illumination and the
reflectance component, a low-pass filter h(n,m) can be found that passes the illumination
component unchanged and suppresses the reflectance component. Then (5) reduces to

$$p(n,m) = B \cdot \ln[x_i(n,m)] + A \cdot \ln[x_r(n,m)] \tag{6}$$

which is identical to (3).

In practical filtering there will be some overlap between the spatial frequencies of the
two components. This will limit the possibility of increasing the contrast in the picture.
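
The chain of equations (2), (5) and (4) can be tried out on a single picture line. The
following is only a minimal sketch under stated assumptions: it works in one dimension,
uses a truncated Gaussian as low-pass impulse response, and clamps indices at the ends of
the line. The function names and the default values of `sigma` and `half_width` are
illustrative and do not come from the paper.

```python
import math

def gaussian_kernel(sigma, half_width):
    """Normalized 1-D Gaussian weights (they sum to 1). Assumed low-pass h."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma))
         for k in range(-half_width, half_width + 1)]
    total = sum(w)
    return [v / total for v in w]

def low_pass(g, kernel):
    """Convolve g with a symmetric kernel; indices beyond the ends are clamped."""
    half = len(kernel) // 2
    last = len(g) - 1
    out = []
    for n in range(len(g)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(n + j - half, 0), last)
            acc += w * g[idx]
        out.append(acc)
    return out

def homomorphic_line(x, A, B, sigma=4.0, half_width=12):
    """Apply equation (5) to one picture line x (all values must be > 0)."""
    g = [math.log(v) for v in x]                          # equation (2)
    lp = low_pass(g, gaussian_kernel(sigma, half_width))  # g * h
    p = [A * gi - (A - B) * li for gi, li in zip(g, lp)]  # equation (5)
    return [math.exp(v) for v in p]                       # equation (4)
```

For a line of constant value c the low-pass output equals the input, so the result is
c raised to the power B, and with A = B = 1 the line is returned unchanged.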

3.2  The  low-pass  filter 

A useful filter function for the low-pass filter is a normalized Gaussian,

$$H(k,l) = \exp\left(-\frac{k^2 + l^2}{2\sigma^2}\right) \tag{7}$$

where k and l are the discrete spatial variables and $\sigma$ is a selected constant.

This filter function has two important properties. It is separable in line- and column-
direction. Two-dimensional filtering can then be achieved by filtering the picture first
line by line and then column by column. This is of importance in real-time processing,
as the number of arithmetic operations is much reduced compared to a circular convolution.

The second important property of this function is that it is rotationally invariant.
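
The separability property can be checked numerically. The sketch below is an illustration
only: it filters a small picture line by line and then column by column with a 1-D
Gaussian, and compares the result with a direct two-dimensional sum using the product
kernel. The function names, the zero padding outside the picture and the truncated kernel
are assumptions made for the example.

```python
import math

def gauss1d(sigma, half):
    """Normalized, truncated 1-D Gaussian weights."""
    w = [math.exp(-k * k / (2.0 * sigma * sigma)) for k in range(-half, half + 1)]
    s = sum(w)
    return [v / s for v in w]

def filter_rows(img, h):
    """Filter every line with h; values outside the picture count as zero."""
    half = len(h) // 2
    cols = len(img[0])
    return [[sum(h[j] * row[m + j - half]
                 for j in range(len(h)) if 0 <= m + j - half < cols)
             for m in range(cols)] for row in img]

def transpose(img):
    return [list(c) for c in zip(*img)]

def filter_separable(img, h):
    """Line by line first, then column by column."""
    return transpose(filter_rows(transpose(filter_rows(img, h)), h))

def filter_direct(img, h):
    """Reference: full 2-D sum with the product kernel h[i] * h[j]."""
    half = len(h) // 2
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for n in range(rows):
        for m in range(cols):
            acc = 0.0
            for i in range(len(h)):
                for j in range(len(h)):
                    r, c = n + i - half, m + j - half
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += h[i] * h[j] * img[r][c]
            out[n][m] = acc
    return out
```

With a kernel of K taps the separable form needs 2K multiplications per picture element,
whereas the direct form needs K squared.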

The  low-pass  filter  can  be  implemented  as  a  frequency  sampling  filter  [3].  This  filter 
is  easy  to  implement  and  to  approximate  to  a  Gaussian  filter  function.  The  bandwidth  is 
tuneable  by  one  parameter  and  it  can  be  implemented  to  have  exact  linear  phase. 

The  difference  equation  for  a  simple  frequency  sampling  filter  can  be  written  as 

$$p(n) = a(n) + b(n) \tag{8}$$

where

$$a(n) = r \cdot a(n-1) + \frac{g(n) - r^N g(n-N)}{N}$$

$$b(n) = 2 r \cos(2\pi/N) \cdot b(n-1) - r^2 \cdot b(n-2) - (g(n) - r^N g(n-N)) \cdot (g(n) - r \cdot g(n-1)) \cdot \cos(\pi/N) \cdot |H(1)| / N$$

where

g(n) is the input signal,

$|H(1)|$ is a sample of the normalized Gaussian,

N is the number of samples of the impulse response. This parameter controls the bandwidth
of the filter.

r  is  a  constant  less  than  one. 
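
A recursion of this kind can be sketched as follows. The code does not claim to be the
authors' exact filter: it uses the usual textbook form of a linear phase frequency
sampling filter with the two frequency samples H(0) = 1 and |H(1)|, in which the comb
output w(n) = g(n) - r^N g(n-N) drives both resonators and the second resonator carries
the factor 2 cos(pi/N). That reading of equation (8), the function name and the default
value of `r` are assumptions.

```python
import math

def frequency_sampling_filter(g, N, h1, r=0.99):
    """Two-sample frequency sampling low-pass filter: H(0) = 1, |H(1)| = h1."""
    c1 = 2.0 * r * math.cos(2.0 * math.pi / N)
    c2 = r * r
    gain = 2.0 * math.cos(math.pi / N) * h1 / N
    rN = r ** N
    a1 = b1 = b2 = w1 = 0.0          # previous values a(n-1), b(n-1), b(n-2), w(n-1)
    out = []
    for n, x in enumerate(g):
        w = x - (rN * g[n - N] if n >= N else 0.0)    # comb section
        a = r * a1 + w / N                            # resonator at frequency sample 0
        b = c1 * b1 - c2 * b2 - gain * (w - r * w1)   # resonator at frequency sample 1
        out.append(a + b)
        a1, b2, b1, w1 = a, b1, b, w
    return out
```

Under these assumptions the response to a unit impulse has exactly N non-zero samples,
r^n (1 - 2 |H(1)| cos(pi (2n+1)/N)) / N for n = 0 ... N-1, with its maximum at
n = (N-1)/2, which is the delay that has to be compensated in the unfiltered path.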

The linear phase property of (8) will introduce a delay of (N-1)/2 samples. Equation
(5) will then be modified to

$$p(n,m) = A \cdot \left(\ln\left[x_i\left(n-\tfrac{N-1}{2},\, m-\tfrac{N-1}{2}\right)\right] + \ln\left[x_r\left(n-\tfrac{N-1}{2},\, m-\tfrac{N-1}{2}\right)\right]\right) - (A-B) \cdot (\ln[x_i(n,m)] + \ln[x_r(n,m)]) * h(n,m) \tag{9}$$

where N is an odd number.
This filter algorithm needs 32 arithmetic operations per picture element. It has three
parameters. In (9) B controls the dynamic range reduction and A controls the contrast
enhancement, and in (8) N controls the bandwidth of the low-pass filter.

4  THE  IMAGE  PROCESSOR 

A general purpose computer does not have the necessary computing capacity for image
processing in real time. A multiprocessor configuration will have the computing power, but
the practical implementation will be complex.

To reduce volume, cost and power consumption, the image processor has to be specialized.

A  specialized  computer  architecture  will  lose  flexibility.  It  is  therefore  important  to 
have  an  architecture  that  is  modular. 

In a modular computer architecture it is necessary to have a powerful data communication
system. Such a computer architecture is shown in figure 2. It is built up of a number of
processing elements (PE) connected to a ring network. A processing element is an
intelligent unit that is specialized for a specific task. Such a task can be video
digitizing, noise cleaning, homomorphic filtering or display processing. A processing
element can also be a general purpose computer or a backend database system.

The ring network is a high-speed packet-switched transmission system for carrying data
between processing elements. Compared to a more traditional bus system, the ring network
has two significant advantages. It requires no global control and it has a higher data
transfer capacity. More than one processing element can simultaneously put a data word
into the ring. The ring will then transmit the data words to the receiving processing
elements.

4.1 The node

A node is an element of the ring network which transmits data words between processing
elements. The node has a register, an input queue and an output queue as shown in figure
3. The register carries a data word by receiving it from one node and sending it to the
next. The input queue receives data words from a processing element and puts them into the
register, and the output queue receives data words from the register and puts them into a
processing element. A queue can be implemented by a first-in first-out memory.

The data word consists of an information word, a sender's address, a receiver's address
and one bit that indicates whether the word is filled or empty.

When a data word is to be transmitted, a processing element puts it into the input queue.
The first data word in the queue is put into the register when the node receives an empty
word. Then it is carried around the ring. When it arrives at the addressed node, it is put
into the output queue and the data word on the ring is marked as empty. The addressed
processing element will then fetch the data word from the output queue of the node.
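
The behaviour of the nodes can be illustrated by a small simulation. This is a sketch
under simplifying assumptions, not a description of the hardware: all registers are
shifted in one common clock step, node addresses are simply list positions, and an empty
word is represented by `None` instead of a filled/empty bit. The names `Node`, `step` and
`send` are hypothetical.

```python
from collections import deque

EMPTY = None  # a register holding EMPTY carries an empty word

class Node:
    def __init__(self, address):
        self.address = address
        self.register = EMPTY
        self.input_queue = deque()    # words from the local PE waiting for the ring
        self.output_queue = deque()   # words delivered to the local PE

def step(ring):
    """Advance the ring by one clock period."""
    # Every register passes its word on to the next node at the same time.
    shifted = [ring[i - 1].register for i in range(len(ring))]
    for node, word in zip(ring, shifted):
        if word is not EMPTY and word["receiver"] == node.address:
            node.output_queue.append(word)     # deliver, then mark the slot empty
            word = EMPTY
        if word is EMPTY and node.input_queue:
            word = node.input_queue.popleft()  # fill the empty slot
        node.register = word

def send(ring, sender, receiver, info):
    """A PE puts a data word into the input queue of its own node."""
    ring[sender].input_queue.append(
        {"sender": sender, "receiver": receiver, "info": info})
```

In a ring of four nodes, a word from node 0 to node 2 and a word from node 1 to node 3
can be under way at the same time, without any global control.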

4.2  The  processing  element 

A  processing  element  (PE)  is  an  intelligent  unit  that  can  either  be  a  general  purpose 
computer,  a  backend  database  processor  or  a  specialized  processor  for  a  specific  task. 

If the PE is a processor that executes homomorphic filtering in real time, it can be
organized as a micro-processor, a bit-slice processor and a specialized arithmetic unit as
shown in figure 4.

The micro-processor controls a PE. It will be able to communicate with another PE,
compute filter parameters and control the bit-slice processor.

The bit-slice processor is a high-speed data transfer controller. It controls data
transfer within the specialized arithmetic unit and the data transfer to and from the node.

All the signal processing can be done in a specialized arithmetic unit, which is directly
wired according to the filter equation. For each multiplication in the filter equation
there is a monolithic multiplier, and for each addition there are adder components.
Registers can be introduced at suitable locations to achieve pipelining. The logarithmic
and antilogarithmic transforms can easily be obtained by look-up tables in read-only
memories. The picture is then filtered simply by clocking the picture element values at
clock rate through the arithmetic unit, first line by line and then column by column.
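
The look-up tables can be sketched in a few lines. The word lengths are not given in the
text, so the 8-bit picture elements and the 12-bit logarithmic values below are
assumptions, as are the names `LOG_ROM` and `EXP_ROM` and the treatment of the value 0.

```python
import math

IN_LEVELS = 256      # assumed 8-bit picture elements
LOG_LEVELS = 4096    # assumed 12-bit logarithmic values
SCALE = (LOG_LEVELS - 1) / math.log(IN_LEVELS - 1)

# ROM contents. The value 0 has no logarithm, so it is treated as 1 here.
LOG_ROM = [round(SCALE * math.log(max(x, 1))) for x in range(IN_LEVELS)]
EXP_ROM = [min(IN_LEVELS - 1, round(math.exp(i / SCALE))) for i in range(LOG_LEVELS)]
```

With these word lengths the quantization of the logarithm is fine enough that the
antilogarithmic table restores every input value from 1 to 255 exactly.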

In  some  cases  it  may  be  possible  to  implement  the  specialized  arithmetic  unit  by  a  custom 
designed  charge  coupled  device  or  some  other  analog  processing  device. 

The computer architecture gives an image processor that has a relatively small volume
and power consumption. The lack of full programmable flexibility is compensated by its
modularity. A new PE can easily be plugged into the system.
SUMMARY

Different methods for the extraction of objects from aerial images are presented. Unlike
other methods which process the complete image systematically, we developed object guided
methods, which are applied only to those parts of the image where objects or parts of
objects have already been detected, i.e. where a continuation of an object is probable.

Basic principles as well as details of the extraction methods are explained. The methods
differ with respect to the local precision of the results, the applicability to different
object types and the required image quality. Local operators are described which evaluate
grey level diagrams in order to detect object continuations.

The methods have been implemented on a DEC PDP 11/70 computer. Results are presented.

1.  INTRODUCTION 

Due to the variability of objects and the complexity of object interrelations, the
interpretation of aerial images is almost completely reserved to specially trained human
interpreters. This is true for each of the different fields of application. An automatic
system to solve this interpretation task is not known to date (Kyoto, 1978). In this
context the term 'image interpretation' stands for the highest level operation, which
comprises any other operation such as image segmentation, object detection, object
classification, evaluation of object contours, object attributes or object interrelations,
etc.

For special subproblems of the interpretation task, such as object classification without
explicit definition of object outlines, several automatic or at least semi-automatic
methods have been developed (DFVLR, 1980; Quiel, F., 1979). Other problems related to
the detection of line and region objects in aerial imagery are investigated by a growing
number of scientists (Zucker, S.W., 1977; Keng, J., 1977; Montoto, L., 1977; Braconne,
S., 1978; Nagao, M., 1978; Nevatia, R., 1978). The systematic processing of the image
matrix, which is a basic characteristic of all methods cited, tends to require much
computing time and sometimes leads to noticeable error rates, due to the variability
of the objects.

The object extraction methods to be described differ from the above mentioned methods
in a number of basic characteristics (compare also Quam, L.H., 1978).

2.  BASIC  PROCEEDING 

In this paper object extraction is defined as the compilation of coordinates of the
object outline (contour) or object centre line on the basis of a step by step tracing,
i.e. a continuation of the contour or centre line is evaluated by prediction and
verification of features and locations. This procedure requires identification of at
least one point of the contour or centre line. The problem of how to identify such
a point either visually or automatically (i.e. the initialization of the procedure)
will be discussed in section 5.

The first characteristic of the extraction methods is their capability to adapt to special
object features or at least to special features of different object types. Three different
object types are distinguished which make up most of the objects recognizable in an
aerial image:

-  line shaped objects are distinguished by their extreme length/width ratio.
Roads, railroads, rivers, etc. belong to this object type. The result of the
extraction of an object of this type will usually be the object's centre line.

-  region like objects can be described as having a large enough area and a closed
contour line to separate them from other objects. Examples of this object
type are an urban area, a forest, an agricultural field, etc.

-  point like objects have only a short length and a small area, such as single
trees, houses, vehicles. The description of the location of objects of this
type is usually confined to a pair of coordinates.


All three object types are treated differently, either by applying different extraction
methods or by inputting special feature values such as the width of a road or the
specific texture of a forest. By the exploitation of this kind of a priori knowledge
about certain object attributes, we succeeded in developing simple and fast extraction
methods which, at the same time, produce reliable results.
It is another basic principle that the extraction methods do not process the complete
image matrix systematically and uniformly, but work in limited parts of the image only.
With the concept of a local initialization, it should obviously be more successful to
analyze the very near neighbourhood of the actual location to detect a possible
continuation of the object or object contour than to apply a universal contour detection
operation to the complete image matrix. This limitation of the area to be screened results
in a considerable reduction of the data to be processed. It also relieves us of
the burden of searching for objects or object contours at locations within the image data
where there is a very low probability of finding one.

The basic algorithm of the extraction methods performs an analysis of a locally limited
grey level function. As we can predict the location and the most probable orientation
of further object parts to be detected, we do not even have to process a two-dimensional
submatrix of grey level values, but can confine the algorithm to the analysis of
one-dimensional grey level profiles which are perpendicular to the predicted object
orientation, e.g., we are looking for the very next cross section or a limited sequence
of continuing cross sections to be contained in the grey level profile at a predicted
location. Details of the analysis of the grey level profile are discussed in section 4.


The possibility of improving the local precision of the results step by step is to be
mentioned as the last basic principle. To accommodate the differing requirements of local
precision in the different fields of application, several extraction procedures can be
initiated sequentially, each of which uses the resulting output of the preceding
procedure as input to be refined. With a growing amount of computer power a first rough
approximation of an object's contour may be refined step by step to yield a precise
approximation.

3.  SPECIALIZED EXTRACTION METHODS

3.1.  Extraction  of  line  shaped  objects 

3.1.1.  Incremental  method 

The  incremental  method  basically  proceeds  as  follows: 

-  suppose that two points p0 and p1 of the centreline of a line object have
already been found (see figure 1)

-  connect p0 and p1 by a straight line

-  predict the location of the next point p2 of the object's centreline at a
certain step width beyond p1

-  compute the coordinates of all points of a semicircle through p2

-  compile the grey levels of all of these points into a grey level diagram which
shows the cross section of the grey level situation at location
p2 (see figure 1b)

-  analyze the grey level diagram to discover features of the line object

-  compare the candidate features with the actual features of the points p0 and
p1 which have been confirmed

-  in case of compliance decide on the true location of p2, adapt the features
and go on.
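
The geometric part of these steps can be sketched as follows. The text does not state
where the semicircle is centred, so the sketch assumes that it is centred on p1 with the
step width as radius, which makes it pass through the predicted point p2. The function
names, the number of sample points and the image[y][x] indexing are likewise assumptions.

```python
import math

def predict_next(p0, p1, step):
    """Extrapolate the straight line p0 -> p1 by `step` pixels beyond p1."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    return (p1[0] + step * dx / length, p1[1] + step * dy / length)

def semicircle_points(p0, p1, step, count=31):
    """Pixel coordinates on the forward semicircle around p1 (assumed centre)."""
    heading = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    pts = []
    for i in range(count):
        angle = heading - math.pi / 2 + math.pi * i / (count - 1)
        pts.append((round(p1[0] + step * math.cos(angle)),
                    round(p1[1] + step * math.sin(angle))))
    return pts

def grey_level_diagram(image, points):
    """Grey levels along the points; points outside the image are skipped."""
    return [image[y][x] for x, y in points
            if 0 <= y < len(image) and 0 <= x < len(image[0])]
```

The middle sample of the semicircle coincides with the predicted point, so a straight
continuation of the object appears in the centre of the grey level diagram and a bend
appears shifted to one side.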

Locally guided by the results of previous steps, this method proceeds incrementally along
the object's centreline. All object points which have been detected and accepted by the
algorithm are displayed immediately for visual control by the human operator. During the
object extraction the procedure's parameters, such as width of the object, average grey
value of the object, average contrast with the neighbourhood, and step width, are
continuously updated to the actual values to adapt to possible changes. The procedure has
been supplemented by the grey level analysis of several concentric full circles around the
actual location, mainly for two reasons (Groch, 1980):

-  to overcome local distortions of the object, which are detected by the
fact that no normal continuation of the object could be found while analysing
the semicircle as mentioned above

-  to evaluate the grey level situation completely in the case of two or more line
objects crossing each other, which is detected by the fact that more
than one normal continuation of the present object was found while analysing
the semicircle.

3.1.2.  Area  slicing  method 

The area slicing method works in a predetermined limited part of the image, called
"area of interest", which usually has a rectangular shape. The area of interest is
expected to circumscribe an interesting part of the line object completely (see fig. 2).
At the same time the orientation of the rectangle is supposed to comply with the
predominant orientation of the object. It has been found that a complete analysis of the
2-D grey level matrix of the area of interest is not necessary to discover the
continuation of a line shaped object, but that the analysis of 1-D sample lines which
contain cross sections of the object is sufficient. The sample lines are perpendicular to
the main axis of the rectangle. The results of the analysis of the sample lines' grey level
functions are combined in order to form sets of collinearly connected points (Rausch
et al., 1979 a); at the end these sets of collinear points are linked to form lines.


3.2  Extraction of region like objects

3.2.1  The  binarization  method 

Certain region objects which are characterized by a rather uniform overall grey level
and a high contrast with their surroundings, such as a dark forest or a lake, may be
extracted by binarization. For this purpose, an area of interest is circumscribed around
the object. Within the area of interest a global evaluation of grey levels is applied to
determine a threshold which will, most likely, separate object pixels from non-object
pixels. Thus, binarizing the area of interest results in several subsets of object
pixels, non-object pixels and neutral pixels of a reject area, which cannot be assigned
without doubt either to the object or to the non-object area in this processing step.
The reject area may be subjected to a postprocessing algorithm which decides on the
basis of a local majority against or in favour of the object pixels' subset. The last step
consists of merging all object pixel subsets and detecting the border of the result.
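
The three-way labelling and the majority decision can be sketched as follows. The text
does not say how the threshold or the reject area are computed, so the sketch takes a
threshold and a reject margin as given, assumes a dark object on a bright background,
and uses the eight neighbours of a pixel for the vote. All names are hypothetical.

```python
OBJECT, BACKGROUND, REJECT = 1, 0, -1

def binarize(area, threshold, margin):
    """Label dark pixels as object and leave a reject band around the threshold."""
    def label(v):
        if v <= threshold - margin:
            return OBJECT
        if v >= threshold + margin:
            return BACKGROUND
        return REJECT
    return [[label(v) for v in row] for row in area]

def resolve_rejects(labels):
    """Assign each reject pixel by the majority of its decided 8-neighbours."""
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for n in range(rows):
        for m in range(cols):
            if labels[n][m] != REJECT:
                continue
            votes = 0
            for dn in (-1, 0, 1):
                for dm in (-1, 0, 1):
                    r, c = n + dn, m + dm
                    if 0 <= r < rows and 0 <= c < cols:
                        if labels[r][c] == OBJECT:
                            votes += 1
                        elif labels[r][c] == BACKGROUND:
                            votes -= 1
            # Ties go to the background in this sketch.
            out[n][m] = OBJECT if votes > 0 else BACKGROUND
    return out
```

A pixel of grey level 95 with threshold 100 and margin 10 falls into the reject band and
is then assigned to the object if most of its decided neighbours are object pixels.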

3.2.2  The  radius  method 

The method aims at the detection of a few sample points of the border line (contour) of
the region object. The samples are connected by straight line segments and are considered
to be a rough approximation of the true contour of the object. To detect the sample
points, grey level functions are analyzed, which are assembled from clockwise arranged
radii originating at a central point within the region object. The intersections of the
radii with the contour are detected as characteristic changes of the grey level function
(edge detection, see also section 4 and fig. 3a and b).
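
A minimal sketch of the radius method is given below. It replaces the edge detection of
section 4 by the simplest possible stand-in, the largest grey level step between
neighbouring samples of a radius, and it assumes nearest-pixel sampling and image[y][x]
indexing. Function names and default values are illustrative only.

```python
import math

def radius_profile(image, centre, angle, length):
    """Positions and grey levels along one radius starting at the centre."""
    cx, cy = centre
    profile = []
    for t in range(length):
        x = round(cx + t * math.cos(angle))
        y = round(cy + t * math.sin(angle))
        if not (0 <= y < len(image) and 0 <= x < len(image[0])):
            break
        profile.append(((x, y), image[y][x]))
    return profile

def contour_sample(profile):
    """Position just behind the largest grey level step (stand-in edge detector)."""
    best, best_step = None, 0
    for (_, ga), (pos_b, gb) in zip(profile, profile[1:]):
        if abs(gb - ga) > best_step:
            best, best_step = pos_b, abs(gb - ga)
    return best

def radius_method(image, centre, n_radii=8, length=50):
    """Contour samples in order of increasing angle, forming a rough polygon."""
    samples = []
    for i in range(n_radii):
        angle = 2 * math.pi * i / n_radii
        p = contour_sample(radius_profile(image, centre, angle, length))
        if p is not None:
            samples.append(p)
    return samples
```

With the row index growing downwards, increasing angles run clockwise on the display, so
the samples come out in the order in which they have to be connected.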

3.2.3  The modified area slicing method

To refine the rough approximation detected by the radius method, a modified version of
the area slicing method may be applied to each segment of the approximation polygon
(Rausch, U. et al., 1979 b). Again a number of grey level functions, perpendicular to the
orientation of the respective polygon segment, is assembled. Cross sections of the true
contour, which should be contained in the grey level functions, are predicted and verified
via an edge detection technique, and the local collinearity of adjacent sample points is
evaluated (see fig. 4).

The modification of the method refers to the edge sensitivity of the grey level function
analysis, which contrasts with the line sensitivity of the same analysis within the
original method; both are discussed in section 4.


3.3  Extraction  of  point  like  objects 

One method has been developed to extract point objects. Remember that "point objects"
need not consist of a single pixel; they should comprise a few pixels which cover a
small but noticeable area; they differ from "region objects" by their small size and
their compact form. The method proceeds in several steps and works only in a limited area
of the image. With the results of the line and region object extraction we confine the
search for specific point objects to promising areas, such as:

-  vehicles are assumed to appear only on or near any kind of road

-  houses and other buildings are expected to exist in urban areas or
in the neighbourhood of roads only.

The process begins with the analysis of one-dimensional grey level diagrams of two
test lines, one perpendicular to the other (see fig. 5). If both diagrams contain
features which indicate the existence of a point object considering its size, shape
and orientation, a cue is assigned to that location and the verification is initiated.
Verification consists of the analysis of the grey values of a two-dimensional submatrix
circumscribing the cue area. At first an attempt is made to separate the supposed object
from its neighbourhood by a binarization technique. The threshold is derived from the
previously successful diagram analysis. The resulting set of object pixels is subjected
to a classification algorithm, which takes into account certain shape features such as
length/width ratio, area/perimeter ratio, absolute size, etc.

4.  Analysis of 1-D grey level functions

The analysis of 1-D grey level functions, the size and the location of which can be
determined by previous results of the extraction process, is a common feature of all
object extraction methods mentioned above. The analysis algorithm is different for the
different object types and is outlined as follows.
4.1  Grey level profile of a line object

The cross section of an undistorted line object (such as a clearly visible part of a
road or a river) in a digitized aerial image shows a characteristic profile in a grey
level function: a bright road in dark surroundings will result in a positive peak
(location A in fig. 6 a); the dark water of a river, which contrasts ideally with bright
surroundings, will lead to a negative peak of the function (location B in fig. 6 a).

In real world images the object profiles in grey level functions will not have such an
ideal form, but will be degraded by various distortions. Thus, the analysis of a grey
level function starts with the assumption that the profile of a line object is contained.
To test this assumption, a value C is computed for each pixel of the grey level
function. The magnitude of C stands for the compliance of the actual limited grey level
function at that location with the assumption. The value C is computed by arithmetic
combination of the absolute grey level differences $\Delta G_1$ and $\Delta G_2$ with the
"width" $\Delta w$ of the profile (see fig. 6 b). The number of pixels to the left and to
the right of the current location X is variable up to an upper limit in order to maximize
the absolute value of $\Delta G_1$ and $\Delta G_2$; the upper limit of the variable test
distance is determined via a priori knowledge about the type of object to be detected and
the scale of the image. In fig. 7 an actual grey level function and the function of the
profile assessment values C are shown. It will be noted that the values C of the peak
parts of the grey level function rank highest, whereas for uniform (i.e. featureless)
parts of the grey level function the values C tend to be zero.
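
The text does not give the arithmetic combination used for C. The sketch below therefore
uses a purely hypothetical one, the smaller of the two flank differences divided by the
profile width, only to show the search over a variable test distance on both sides of
the current location. The function name and the formula are assumptions; the returned
value is a magnitude and does not distinguish bright peaks from dark ones.

```python
def compliance(g, x, max_dist):
    """Hypothetical profile assessment value C at position x of the profile g."""
    best = 0.0
    for sign in (1, -1):                 # bright peak first, then dark peak
        sides = []
        for direction in (-1, 1):        # left flank, then right flank
            drop, dist = 0, 0
            for d in range(1, max_dist + 1):
                i = x + direction * d
                if 0 <= i < len(g) and sign * (g[x] - g[i]) > drop:
                    drop, dist = sign * (g[x] - g[i]), d   # largest difference so far
            sides.append((drop, dist))
        (dg1, d1), (dg2, d2) = sides
        if dg1 > 0 and dg2 > 0:          # a peak needs a flank on both sides
            best = max(best, min(dg1, dg2) / (d1 + d2))
    return best
```

For the profile 50, 50, 50, 50, 90, 130, 90, 50, 50, 50, 50 and a test distance of 3
this gives the highest value at the peak and zero on the uniform parts, which is the
qualitative behaviour described above.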

4.2  Grey level profile of region object contours

The cross section of a region object's contour shows, in an ideal case, characteristic
features (see fig. 8). In real images such an ideal profile is often degraded by
specific structures of the object, by variable contrasts and by noise.

To test again the assumption that a cross section of a region object's contour is
contained in the grey level function, a modified value of compliance $C_m$ is computed for
each pixel of the grey level function. The value $C_m$ results from an arithmetic
combination of the following local features of the grey level function (see fig. 8):

-  the grey level difference $\Delta G$ measured in section M of the function,
where X is in the centre of M

-  the lack of grey level differences $E_L$ and $E_R$ ("evenness") measured
in the sections L and R of the function

-  the difference of grey level variances measured in the sections
L and R of the function.

The widths of the sections M, L and R are variable with an upper limit, which is determined
by a priori knowledge about the objects to be extracted and the scale of the image. The
actual width of M is chosen to maximize $\Delta G$, whereas the actual widths of L and R
are chosen to minimize the respective grey level differences, i.e. to maximize the evenness
of the function within these sections.

The operation may be interpreted as superimposing a gauge of an ideal "step edge" of
variable but limited size on each location of the actual grey level function and computing
a value of compliance between this gauge and the local function values. In
fig. 9 an example of a grey level function of a real image and the computed values $C_m$
are shown.


5.  Initialization  of  the  methods 

The initialization of the methods described consists of two main parts:

-  input of feature values, parameter values, thresholds etc.

-  definition of starting locations, from where a continuation of the
object or the object contour can be traced

To solve the first problem a set of default values has been compiled in numerous tests.
These values have proven to be useful for a variety of applications and can be changed
by operator interaction if necessary.

To solve the second problem, the simplest and most flexible, but not the fastest
method consists of a visual identification of suitable starting locations by the operator
with manual input of the respective coordinates. To support the operator in this task
an automatic method has been developed, which will be described for the example of roads.

A starting location for the road extraction method is defined as a clearly visible,
undistorted, small section of a road. Such a section is accepted only if it
satisfies certain conditions, which are tested sequentially. First, a grey level
function analysis is initiated along several equally spaced horizontal and vertical
image lines. Each location where the profile of a line object has been detected is
marked as a cue for a detailed investigation. This investigation tries to verify a cue
by a full analysis of the grey level functions of two concentric circles around the cue
location. The cue is accepted as a starting location for road extraction only if the
following conditions are satisfied:

- two and only two cross sections of a line object were detected at
the inner circle

-  two  and  only  two  cross  sections  of  a  line  object  were  detected  at  the  outer 
circle 

-  all  cross  sections  detected  and  the  cue  itself  are  collinear 

-  the  grey  level  variation  of  all  pixels  between  the  far  most  cross  sections 
detected  does  not  exceed  a  specified  limit. 

For each starting location which has been accepted, the location coordinates and its
orientation are passed to the extraction method to enable the first prediction of the
continuation (Groch, 1979).
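
The acceptance test for a cue can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not the authors' implementation: the function names, the collinearity tolerance `tol`, and the use of the max-minus-min range as the "grey level variation" are all hypothetical choices, since the paper does not specify them.

```python
import math

def collinear(points, tol):
    """True if every point lies within tol of the line through the
    two points that are farthest apart (assumed collinearity test)."""
    a, b = max(((p, q) for p in points for q in points),
               key=lambda pq: math.dist(pq[0], pq[1]))
    length = math.dist(a, b)
    if length == 0:
        return False
    for x, y in points:
        # Perpendicular distance of (x, y) from the line a-b.
        d = abs((b[0] - a[0]) * (y - a[1]) - (b[1] - a[1]) * (x - a[0])) / length
        if d > tol:
            return False
    return True

def accept_cue(cue, inner_hits, outer_hits, grey_values,
               tol=1.0, max_variation=20):
    """Apply the four conditions in the order listed in the text.

    inner_hits / outer_hits: cross sections of a line object found on the
    inner and outer circle; grey_values: grey levels of the pixels between
    the outermost cross sections. tol and max_variation are assumed values.
    """
    if len(inner_hits) != 2 or len(outer_hits) != 2:
        return False
    if not collinear([cue, *inner_hits, *outer_hits], tol):
        return False
    if not grey_values:
        return False
    return max(grey_values) - min(grey_values) <= max_variation
```

The conditions are tested sequentially, so the cheaper counting tests reject most cues before the collinearity and grey level checks are evaluated.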

6.  RESULTS 

All methods have been implemented on a DEC PDP 11/70 computer. Due to the limitations
of the computer, the programs could not all be run as a single task; the different tasks
which have to be installed interface with each other via data files on magnetic disk.

The methods were tested with aerial imagery of scales from 1 : 10000 to 1 : 100000.
Subimages of a size of 6 cm x 6 cm were digitized using a DICOMED device. The spatial
resolution was adapted to the scale of each image so that each pixel of the image matrix
represented a circular area of 2 m to 5 m in diameter on the ground. The following examples
are reproductions of displays from a COMTAL screen.

Figure 10 shows results of extracted roads with the incremental method. Only one starting
location on the horizontal part of the road at the right side was provided and no
interactive correction or postprocessing was necessary.

Figure 11 gives an example of the extraction of a variety of line objects. The result
was produced by the combined application of the incremental and the area slicing methods.
Both methods complement each other very well: while the incremental procedure has
advantages at curved lines and in the detection of all branches of an intersection, the
area slicing method proves superior at distorted sections, occluded parts and
variable features of the objects.

In figure 12 an example of the extraction of a forest region is shown. A first
approximation of the forest's contour was produced by the radius method and was used to
guide the modified area slicing method. The resulting precise approximation of the
contour still has a few gaps which could be filled by a postprocessing operation. The
last example is shown in figure 13, which demonstrates the method for extraction of point
objects. The display consists of roads and surrounding strips in an urban area. All the
cues for vehicles which have been detected during the first step of the procedure are
marked with a cross, and those objects which were finally accepted are marked with small
rectangles. Some imperfections in performance, especially regarding cars parked at
the border of the road, may need refinement.

7.  CONCLUSIONS 

In conclusion, the following facts may be stated in support of the solutions explained
above:

- interaction of a human operator, which still proves to be necessary, has been
concentrated on the initialization of the methods and the correction and
completion of results

- due to the reduction of image data to be processed, fast computing can be
realized. An assessment has shown that, with a modern computer system, the
extraction of a complete network of roads from an image of an average
2 km x 2 km section of a civilized country needs not more than 10 sec of
processing time (without data input and output and without interaction)

- the results are reliable although not always complete, and interactive
completion proves to be easier than the correction of false results.

The adaptation of the methods to other sensor data (e.g. radar data) and the exploitation
of map data as an input to cartographic information systems are planned.
Figure 1: Basic procedure of the incremental method;
a) prediction of point P*; b) grey level function of semicircle K

Figure 2: Characteristics of the area slicing method;
a) area of interest superimposed on the image; b) area of interest
consisting of sample lines; c) locations with a high value of
compliance C

Figure 3: Basic idea of the radius method;
a) radii starting at central point P0; b) grey level functions for
radii R1 and R2

Figure 4: The modified area slicing method;
a) area of interest containing a part of a region's contour; b) sample
lines with marked locations of high values Cm; c) grey level function of
first sample line

Figure 5: Explanation of the extraction of point objects;
a) test lines 1 and q; b) grey level functions of test lines

Figure 6: Analysis of grey level profiles of line objects;
a) idealized profiles of line objects; b) features of the profile at
location X
Figure 9: Demonstration of the contour detection algorithm

Figure 10: Result of the extraction of roads by the incremental
method

Figure 13: Result of the extraction of vehicles;
a) unprocessed parts of an image; b) cues found in the search areas;
c) accepted vehicle locations
SUMMARY

An adaptive spatial filter has been implemented in digital hardware and a number of subsequent
processing steps have been developed including static clutter cancellation and tracking. The hardware is
part of an experimental ground based infra-red system for low level air surveillance.

The  spatial  filter  algorithm  is  based  on  using  the  intensities  recorded  in  a  small  window  to  detect 
locally  significant  peaks.  The  effect  of  noise,  quantisation  and  mismatch  between  channels  is  discussed. 
Variation  in  target  subtense,  target  registration  and  inter-channel  gaps  affect  detection  probability  and 
this  also  has  been  explored  by  computer  simulation. 

The filter produces multiple detections on a single peak and therefore an additional processing
stage was devised to reduce multiple detections to single detections. Stationary alarms are removed by
a process of static clutter cancellation and remaining alarms are handled by track forming using a
microprocessor. Angular rate limits are applied to tracks, and established tracks formed within the rate
limits are confirmed as moving targets.

SYMBOLS 

C      algorithm 1 threshold parameter

K      algorithm 2 threshold parameter

x_n    co-ordinates of an alarm on the nth scan

px_n   predicted position of an alarm on the nth scan

Δx     prediction window half width

CE     central pixel recorded amplitude

V_i    pixel element i recorded amplitude


1.  INTRODUCTION 

This work has been carried out in the context of the feasibility of passive, automatic infra-red
surveillance for short range surface to air guided weapons. The emphasis of the work is towards achieving
around 10 km detection range against fighter ground attack aircraft at an operating false alarm rate of
2-3 per hour.

Initial work was aimed at providing a raw data base for targets and backgrounds, but the scope of
work was subsequently widened to provide an experimental hardware implementation. An experimental
panoramic surveillance equipment operating in the 8-13 µm waveband was procured from EMIE at Feltham.
Concurrently with the sensor procurement, Plessey at Havant were contracted to investigate signal processing
schemes, in particular digital spatial filter algorithms, and to produce a hardware implementation of the
most promising spatial filter algorithm studied. The filter hardware was interfaced to the EMIE scanner,
and the RSG9 trial at Taranto in 1978 was the first opportunity to demonstrate the techniques evolved and
to gather data suitable for further investigation. As a result of further work, additional processing
hardware has been built to complete the system as described.

1.1 The Sensor

Fig 1 shows a functional block diagram of the equipment. The sensor provides image quality data
over a coverage of 360° x 6.8° at a rate of two revolutions per second. The elevation coverage of the
equipment is achieved in a single swathe per scan using a 192 element detector at a geometrical
resolution of 0.5 mR. The figure shows that two outputs of the head are used. First, the whole panoramic
coverage is digitised to 8 bit resolution and the resulting real time digital video is made available to
the signal processing. Secondly, a sector of the panorama covering 13.4° x 6.8° is stored to 6 bit
resolution for TV display, the bearing of the sector being selectable over 360°.

The sensor head achieves panoramic coverage by rotating a 45° mirror in a periscope structure. The
detector with its associated cooling system is fixed and a stationary image is presented to the image
plane by means of a sophisticated derotation system. The detector output signals are AC coupled to
preamplifiers and, after band shaping, amplification and multiplexing, are converted to 8 bit samples for
signal processing. Each channel is sampled at 0.49 milliradian intervals in azimuth, giving a data rate of
4.9 x 10^6 pixels per second.
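
The quoted data rate can be checked from the figures given above (192 channels, 0.49 milliradian azimuth sampling, two revolutions per second). The short Python sketch below is only an arithmetic check; the function and parameter names are illustrative.

```python
import math

def pixel_rate(sample_mrad=0.49, channels=192, revs_per_second=2.0):
    """Pixels per second for a full 360 degree panoramic scan."""
    # One revolution is 2*pi radians = 2*pi*1000 milliradians.
    samples_per_rev = 2.0 * math.pi * 1000.0 / sample_mrad
    return samples_per_rev * channels * revs_per_second

rate = pixel_rate()
print(f"{rate:.2e} pixels per second")  # roughly 4.9e6
```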

1.2  The  Spatial  Filter 

With an 8-13 µm broadband surveillance sensor, spatial filtering provides the prime technique whereby
an infra-red target can be distinguished from the majority of clutter background detail. The statistics
of IR background signals are strongly non-stationary and attempts to design a single optimum filter are
doomed to failure, since the radiance distributions change markedly with elevation (ground, horizon, sky),
azimuth angle (buildings clustered over small sectors) and time (solar heating, deployment of battlefield
paraphernalia). What is needed is an adaptive filter capable of adjusting its threshold decision as the
need arises. The spatial filter algorithm used is based on using the intensities recorded in a small
window to detect locally significant peaks. The window is a 3 x 3 pixel grid, which at its most compact
is a sparse arrangement within a 3 x 5 rectangle. The detection criteria are that the central pixel
amplitude shall exceed the mean of the other grid elements by a preset value C and exceed the mean plus
a preset fraction K of the difference between the maximum and minimum element values. These thresholds
are calculated and refreshed at each and every pixel during the surveillance scan. In this way, the
decision threshold instantaneously applied to each pixel in turn varies with the clutter content in the
neighbourhood of the pixel and, by choosing both the overall size of the window and its element sizes,
small hotspots including aircraft targets are extracted. Relatively wide angle detail (cloud edges,
masts etc) or adjacent conglomerations of hotspots are rejected. The output of the spatial filter
consists essentially of the co-ordinates of detected hotspots, but other data used in the filter, including
amplitude, are also available.

Because the sensor is passive and broadband, this information is not supplemented by either range or
velocity data.
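
The two detection criteria described above can be sketched in a few lines of Python. This is a minimal software illustration, not the hardware implementation: the function name, the image layout (a list of rows) and the default neighbour spacing `d_row=1`, `d_col=2` (one reading of the sparse 3 x 3 grid inside a 3 x 5 rectangle) are assumptions.

```python
def detect_peaks(image, C, K, d_row=1, d_col=2):
    """Return (row, col, amplitude) for every pixel passing both thresholds.

    image: list of equal-length rows of pixel amplitudes.
    The 8 neighbours are taken at offsets of +/- d_row and +/- d_col
    (assumed sparse window geometry).
    """
    offsets = [(dr, dc)
               for dr in (-d_row, 0, d_row)
               for dc in (-d_col, 0, d_col)
               if (dr, dc) != (0, 0)]
    rows, cols = len(image), len(image[0])
    alarms = []
    for r in range(d_row, rows - d_row):
        for c in range(d_col, cols - d_col):
            ce = image[r][c]
            v = [image[r + dr][c + dc] for dr, dc in offsets]
            mean = sum(v) / len(v)
            spread = max(v) - min(v)
            # Criterion 1: exceed the neighbour mean by the preset value C.
            # Criterion 2: exceed the mean plus K times (max - min).
            if ce > mean + C and ce > mean + K * spread:
                alarms.append((r, c, ce))
    return alarms
```

With this sketch an isolated hot pixel on a flat background raises an alarm, while a straight step edge does not, because the edge inflates the max-minus-min term of the second threshold.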


1.3  Subsequent  Processing  Stages 

Recorded output from the spatial filter formed a data base for computer simulation of subsequent
processing stages which have now been implemented, and figures quoted for data reduction by subsequent
stages are those obtained from simulations on an extensive data base. It was originally conceived that
the processing of alarms should be handled by a single microprocessor producing tracks and flagging those
tracks which were target like. This led to the need for a degree of algorithmic sophistication which
meant that the processor used was not capable of handling the data rate. By adding intermediate
processing stages it was possible to reduce the load on the processor and simplify the tracking algorithms, to
allow for an increased data rate and possibly some enhancements to the basic tracking concept.

2.  ANALYSIS  AND  RESULTS 

2.1  Spatial  Filter 

The general configuration of the spatial filter is shown in Fig 2. The algorithm was chosen because
of the simplicity with which it can be implemented digitally. In the simple case where signal energy is
concentrated in the central element and the surrounding 8 pixels are each independent samples of the
same gaussian noise process, the probabilities of detection and of false alarm on noise are analytic
(Appendix 1). The two thresholds calculated by the filter are distributed with mean values of C and 2.845K
referred to rms noise. The distribution of the second threshold is skew with long tails, which means both
a high probability of false alarm on noise and a high probability of setting a high threshold on signal,
neither of which is particularly desirable. The distribution of the first threshold is a narrow gaussian
and was introduced as a 'catch' to reduce the false alarm rate on noise (Fig 3) which results from the
other threshold.

The practical situation does not lend itself so readily to analysis. The 8 pixels
used in the algorithm to derive a threshold which the signal must exceed are taken from 3 separate detectors.
Although these are approximately matched by the manufacturer, there is nonetheless a spread in their
detectivities and responsivities, and a spread in the matching of the gains and offsets of their
respective preamplifiers caused by misalignment, ageing and thermal drift. The noise processes which occur
within the dewar containing the detectors combine in such a way that the assumption of independence
between samples is questionable. The filter hardware uses discrete digitised samples to perform its
arithmetic and hence the dynamic range of signals is limited.

A Monte Carlo model approach was adopted
in an attempt to assess the probable effect of mismatches and to see at what level of quantisation the
discrete solution differed from the continuous analytic solution. For quantisation levels less than rms
noise there is little difference between the discrete and continuous performance, but beyond that the K
threshold ceases to be significant and the threshold loses its ability to adapt effectively to
background detail. Taking account of this and of the maximum quantisation error in a signal, it is undesirable
to quantise more coarsely than around 1 level per rms noise. The effects of the degradation processes
in the real situation are, as one might expect, an increase in false alarm rate and a reduction in target
detection probability (see Fig 4). There is an increasing probability that a large signal will not be
detected as the system is progressively degraded by mismatching. This relates physically to the banding
which is seen with parallel scan thermal images and may, to some extent, be compensated for by varying
the parameters K and C dynamically. This has been allowed for in the implementation, and K and C may be
altered at video rates.
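
A Monte Carlo check of the idealised noise-only case can be sketched as below. It is not the authors' model (it includes no channel mismatch or quantisation); it simply draws nine independent unit-variance gaussian samples per trial, applies the two thresholds, and estimates the false alarm probability together with the mean of the max-minus-min term, which should come out close to the 2.845 quoted in the text. The function name, trial count and seed are illustrative assumptions.

```python
import random

def simulate_noise(C, K, trials=20000, seed=1):
    """Estimate (false alarm probability, mean neighbour range) on pure noise."""
    rng = random.Random(seed)
    false_alarms = 0
    range_sum = 0.0
    for _ in range(trials):
        ce = rng.gauss(0.0, 1.0)                       # central element
        v = [rng.gauss(0.0, 1.0) for _ in range(8)]    # 8 neighbours
        mean = sum(v) / 8.0
        spread = max(v) - min(v)
        range_sum += spread
        if ce > mean + C and ce > mean + K * spread:
            false_alarms += 1
    return false_alarms / trials, range_sum / trials

pfa, mean_range = simulate_noise(C=3.0, K=1.0)
print(f"false alarm probability ~ {pfa:.4f}, mean range ~ {mean_range:.2f}")
```

Extending the sketch with per-channel gain and offset errors, or with rounding of the samples to a chosen quantisation step, would be one way to reproduce the kind of degradation study described above.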

The spatial filter in the absence of noise is capable of discriminating against those features which
would give rise to an alarm with a simple one dimensional filter (see Fig 5) and in favour of point
targets. The subtense of a true target, however, depends on its range, size and aspect, and it is only
unresolved at the furthest range at which the sensor is capable of detecting it. Figure 6 illustrates the
spatial properties of just two possible windows against a square target. The detector output signals are
calculated by convolving the target with the optics point spread function and the detector, and the
signals presented to the filter are calculated by convolving the detector output signals with the impulse
response of the electronics. To obtain the probability of detection, the square target is taken to be
centred with equal probability anywhere within a rectangle, centred on the central element of the window,
0.49 mR wide by 0.623 mR high, to account for registration losses. The size of the rectangle is fixed
by the sampling rate and the detector pitch.

As target subtense increases, the target spreads into the 8 pixels
surrounding the central element, increasing their mean value and hence raising the detection threshold.
Because the target may not register perfectly with the sampling and scanning, the spread is preferential
to one or other element; this also raises the detection threshold, depending on the value of K. The
first window is not a suitable choice: K must be near one at least to discriminate against simple features,
and that value leads to poor target detection performance. Window 2, however, has good performance for
targets of less than 1.5 mR subtense and discriminates well against targets of larger subtense. In the case
of small signals and in the presence of noise, both thresholds must be considered as the target size is
increased. For small targets, misregistration between target and signal sample leads to a reduced central
element value, and spreading of larger targets leads to an increase in the mean. These two mechanisms
increase the effective value of C (see Fig 7). The effect of large target subtense on K has been accounted
for in Fig 6 but the effect of small target loss has not; it is in fact simply a reduction in signal to
noise ratio.

2.2  Multiple  Alarm  Cancellation 

Because the spatial filter uses a sparse matrix of picture elements but computes for every picture
element, there is a possibility that a point or slightly extended source will generate more than one alarm.
The number of alarms possible from a single source increases with the window size and with the source size,
within the limits of the window size. In fact, for window 2 around 30% of all clutter alarms
consist of clusters of alarms. Aircraft target alarms are seemingly more likely to be multiples: from
an analysis of the RSG9 Taranto trial results, 60% of aircraft detections at 10 km range and 95% at 5 km
consisted of more than one alarm.

These alarms have more than a nuisance value in increasing the loading on the tracking process: they
lead to the creation of more than one trackfile for what is essentially the same track. This leads in
turn to conflicts in associating new plots with existing tracks and an unacceptable increase in the target
track confirmation time. There is no penalty in reducing connected clusters of alarms to single alarms.

Two  algorithms  have  been  studied.  The  first  is  based  on  rejecting  all  but  the  alarm  with  the  largest 
amplitude  of  a  set  of  connected  alarms.  Alarms  are  reckoned  to  be  connected  if  their  co-ordinates  are 
within  a  predetermined  range  of  each  other,  say  2.5  milli  radians.  The  advantage  of  this  algorithm  with 
targets  is  that  it  tends  to  determine  the  thermal  centroid  and  it  may  reduce  the  number  of  clutter  false 
alarms  generated  by  objects  in  the  foreground  such  as  wire  stays  which  a  sparse  filter  matrix  does  not 
effectively  reject.  The  disadvantage  of  the  algorithm  is  that  it  is  not  easily  implemented.  The  second 
algorithm  is  to  take  the  first  alarm  detected  and  inhibit  further  alarms  within  a  rectangular  window  of 
which  the  first  alarm  is  the  corner.  This  algorithm  is  easily  implemented. 

A  comparison  showed  that  the  first  algorithm  was  only  marginally  more  effective  than  the  latter  and 
that  a  window  of  5  x  5  pixels  in  the  second  multiple  cancelling  algorithm  is  near  to  optimum.  This  has 
now  been  translated  into  hardware. 
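
The second, easily implemented algorithm can be sketched as follows. This is a hedged Python illustration rather than the hardware logic: it assumes alarms are given as integer (azimuth, elevation) pixel co-ordinates, that "first detected" means first in sorted order, and that the inhibition window extends forward from the first alarm in both co-ordinates. The function name and these conventions are assumptions.

```python
def cancel_multiple_alarms(alarms, width=5, height=5):
    """Keep the first alarm of each cluster and inhibit later alarms that
    fall inside a width x height window whose corner is a kept alarm."""
    kept = []
    for az, el in sorted(alarms):
        inhibited = any(0 <= az - kaz < width and 0 <= el - kel < height
                        for kaz, kel in kept)
        if not inhibited:
            kept.append((az, el))
    return kept

# Example: three alarms from one source collapse to a single alarm.
print(cancel_multiple_alarms([(10, 10), (11, 12), (14, 14), (30, 30)]))
```

Unlike the first algorithm, this one needs no amplitude comparison across the cluster, which is consistent with the remark that it is easily implemented.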

2.3  Static  Clutter  Cancellation 

Once multiple alarms have been removed from the output of the spatial filter, there inevitably remain other
alarms originating from detail in the thermal scene which is essentially stationary. If trackfiles are
set up to track these stationary targets then an unnecessary load is placed on the processor. They do in
fact warrant a separate process, which has been implemented after multiple alarm cancellation. The
mechanism involves storing a map of alarms, correlating incoming alarms with the map and at the same
time updating the map. The reason for implementing this process after multiple alarm cancellation is that
it reduces the amount of storage needed for the map. Correlation is carried out to a lower precision than
the angular data rate; this is to take care of scan to scan angular jitter from sensor registration noise,
wind blown clutter and apparent movement of highlights on a geometrically extended source.

The static clutter cancellation works as follows. The co-ordinates of all the alarms received on the
first scan are stored sequentially. A number is also stored with each alarm denoting the number of scans
over which that alarm has been seen; this is called the clutter number. In the first instance this would
be one. On the second and subsequent scans, alarms received are compared with the stored alarms and new
alarms are added to the store. Alarms already in the store from previous scans have their clutter number
incremented by one, up to a predetermined limit, and stored alarms which have not been sensed have their
clutter number decremented. Predetermined upper and lower limits decide whether the alarm is passed on
to the tracking computer or deleted from the clutter store.

By selecting the limits and increments it is possible to adjust the number of times that an alarm
must be seen before it is inhibited from the next processing stage of tracking, and the number of times
that a previous static alarm may fail to be seen before it is deleted from the store. The purpose of
these criteria is to ensure that slow moving targets are neither masked nor delayed in being confirmed,
and that scintillating clutter is not continually being deleted from and re-added to the map.

The efficiency of such a scheme will depend strongly on the scene being surveyed, but in simulations
using data gathered at the RSG9 Taranto trials it effected between a 50% and 70% reduction in the number of
alarms. The choice of resolution and of the number of times an alarm must be seen before it is inhibited
depends on the slowest targets which it is desirable to track, around 1 milliradian per second. A
window size of 1 mR for correlation around each alarm was chosen, with the requirement that an alarm be
seen 3 times before it is cancelled. The choice of parameters is not critical: halving the resolution
yields only a 2% improvement in cancellation rate, and increasing the time to delete an alarm from store
from 2.5 seconds to 25 seconds yields only a 1% improvement. The software implementation of the
algorithm uses a linked data structure with pointers to enable rapid and efficient correlation. The
translation into hardware has been made using two stores with associated logic, each of 1000 alarm capacity,
which alternate in use every scan. One is the reference map, which is read from, and the other the updated
map, which is written into from the old map and newly received alarms. All new alarms and those not yet
inhibited are passed to the microprocessor for track forming.
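
The clutter number mechanism can be sketched in Python as below. This is a simplified illustration under several assumptions that the text does not fix: alarms are quantised into 1 mR cells rather than correlated within a window around each stored alarm, an alarm is inhibited once its clutter number has reached `inhibit_at` (3), and the upper limit `max_count` is 5 so that a saturated entry survives five missed scans (2.5 seconds at two scans per second). The class and parameter names are hypothetical, and a dictionary stands in for the linked structure or twin hardware stores.

```python
class ClutterMap:
    """Scan-to-scan map of stationary alarms keyed by quantised position."""

    def __init__(self, cell_mrad=1.0, inhibit_at=3, max_count=5):
        self.cell = cell_mrad
        self.inhibit_at = inhibit_at
        self.max_count = max_count
        self.counts = {}  # cell -> clutter number

    def update(self, alarms):
        """Process one scan; return the alarms passed on to the tracker."""
        seen = set()
        passed = []
        for az, el in alarms:
            cell = (int(az // self.cell), int(el // self.cell))
            if cell not in seen:
                seen.add(cell)
                # New alarms start at one; known alarms count up to the limit.
                self.counts[cell] = min(self.counts.get(cell, 0) + 1,
                                        self.max_count)
            if self.counts[cell] < self.inhibit_at:
                passed.append((az, el))
        # Stored alarms not sensed on this scan are decremented and
        # deleted when their clutter number reaches zero.
        for cell in list(self.counts):
            if cell not in seen:
                self.counts[cell] -= 1
                if self.counts[cell] <= 0:
                    del self.counts[cell]
        return passed
```

In this sketch a target moving by more than one cell per scan always appears as a new alarm and is passed through, while a stationary hotspot is inhibited from its third sighting onward.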

2.4  Microprocessor  Track  Forming 

Remaining alarms are processed by a microprocessor to form tracks. The idea is to associate the
angular pattern of alarms from scan to scan and deliberately search for movement by establishing tracks
from scan to scan plot data. As each track is updated, confidence in deciding that the track belongs to
a target is updated according to preset algorithms and at some point in the processing the presence of a
target is confirmed.
The basic approach is to establish a trackfile for each new, unassociated alarm or plot found in each
scan. For example, at the end of a scan there may be, say, five trackfiles; on the next scan there may be
ten plots present. The task is to associate the existing trackfiles with the plots present. Association
is carried out by storing in each trackfile a predicted track position for the next scan together with an
association window. If one of the next scan plots falls within the prediction window of a trackfile it
will be associated with the track and the trackfile updated. The way in which the trackfile is updated
depends on the results of the association process. Eventually, in the case of a consistent target track,
sufficient confidence is accumulated to cause track confirmation; otherwise trackfiles are deleted and never
confirmed because of lack of updating and lack of confidence. As the trackfiles become regularly updated,
tracks can be classified as fast or slow, and slow tracks may therefore be deleted on the assumption that
they are new fixed clutter which has not been inhibited by the static clutter cancellation hardware.

The actual association is attempted by creating a shortlist of plots which fall within the prediction
window of each trackfile. If no plots are present the window is doubled in size and the process repeated.
If there are still no associations found, the track is coasted and the confidence factor decremented. If
the window size has not been widened and a single plot is found in the shortlist, this represents the
optimum situation and the confidence factor is incremented. Table 1 summarises the rules used to cover all
eventualities during the association process. Following association the trackfile is updated by re-filing
the track history over the last 3 scans and re-computing a new prediction for the next scan. The prediction
rules are:


one point prediction (new trackfile)

    $px_{n+1} = x_n \pm \Delta x$

two point prediction

    $px_{n+1} = 2x_n - x_{n-1} \pm \Delta x/2$

three point prediction

1) track not coasted

    $px_{n+1} = 3x_n - 3x_{n-1} + x_{n-2} \pm \Delta x/6$

2) track coasted

    $px_{n+1} = 3x_n - 3x_{n-1} + x_{n-2} \pm \Delta x/3$


where px_{n+1} is the predicted position of the next alarm of an established track, Δx the half width of the
association window and x_n the position on the nth scan. It can be seen that the prediction window size
is adapted appropriately as more information on the track becomes available. The initial value of Δx
(one point prediction) is set so that targets with radial rates greater than 3.5 degrees per second are not
tracked.
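
The prediction rules can be written as a small Python function. This is a one-co-ordinate sketch with assumed names (`predict`, `history`, `dx`); it follows the one, two and three point rules, with the window half width narrowing from Δx to Δx/2 and then to Δx/6 (or Δx/3 for a coasted track). The three point rule is a quadratic extrapolation through the last three positions.

```python
def predict(history, dx, coasted=False):
    """Return (predicted position, window half width) for the next scan.

    history: positions on previous scans, most recent last.
    dx: initial half width of the association window (assumed units: mR).
    """
    if not history:
        raise ValueError("at least one position is required")
    if len(history) == 1:
        # One point prediction (new trackfile): no motion information yet.
        return history[-1], dx
    if len(history) == 2:
        # Two point prediction: linear extrapolation.
        return 2 * history[-1] - history[-2], dx / 2
    x_n2, x_n1, x_n = history[-3:]
    half_width = dx / 3 if coasted else dx / 6
    # Three point prediction: 3*x_n - 3*x_(n-1) + x_(n-2).
    return 3 * x_n - 3 * x_n1 + x_n2, half_width

# A track following x = n^2 (0, 1, 4) is extrapolated exactly to 9.
print(predict([0.0, 1.0, 4.0], dx=30.0))
```

As a rough illustration of the scale involved, a limit of 3.5 degrees per second at two scans per second corresponds to about 1.75 degrees, or roughly 30 mR, of movement between scans.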


When all trackfiles have been updated, a new trackfile is established for each alarm not used in the
association and updating process. Finally all the trackfiles are examined for status to determine whether
the track may be confirmed or deleted. For a track to be confirmed as an alarm it must have reached a
predetermined confidence level, the slow target flag must not be set and the coast flag must be cleared.
Tracks are deleted if they have to be coasted for two scans or if the confidence falls to zero. In
general, for a well behaved track, confirmation will occur after three consecutive detections within four
scans. This is essential to maintain reasonable target confirmation ranges. A slow target is not
deleted so long as it continues to be updated, but unless its status changes it is never confirmed.
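
The association and status rules that are spelled out in the text can be sketched as below. Table 1 is not reproduced here, so this Python sketch covers only the cases described in prose; where the text is silent (for example, several plots in the shortlist) it takes the nearest plot and leaves the confidence unchanged, which is an assumption. The names `Track`, `associate`, `status` and the confirmation level of 3 are likewise hypothetical, and positions are one-dimensional for brevity.

```python
from dataclasses import dataclass

@dataclass
class Track:
    predicted: float        # predicted position for this scan
    half_width: float       # half width of the association window
    confidence: int = 1
    coasted_scans: int = 0
    slow: bool = False

def associate(track, plots):
    """Pick a plot for the track, or return None if the track is coasted."""
    def shortlist(scale):
        return [p for p in plots
                if abs(p - track.predicted) <= scale * track.half_width]

    widened = False
    candidates = shortlist(1)
    if not candidates:
        widened = True
        candidates = shortlist(2)       # window doubled and search repeated
    if not candidates:
        track.coasted_scans += 1        # track is coasted
        track.confidence -= 1
        return None
    track.coasted_scans = 0
    if len(candidates) == 1 and not widened:
        track.confidence += 1           # optimum situation
    # Assumed tie-break: nearest plot to the prediction.
    return min(candidates, key=lambda p: abs(p - track.predicted))

def status(track, confirm_level=3):
    """Return 'delete', 'confirm' or 'tentative' for a trackfile."""
    if track.coasted_scans >= 2 or track.confidence <= 0:
        return "delete"
    if (track.confidence >= confirm_level and not track.slow
            and track.coasted_scans == 0):
        return "confirm"
    return "tentative"
```

A full tracker would follow each successful association by refreshing the track history and recomputing the prediction for the next scan.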

3.  CONCLUSIONS 


On the basis of measured performance of the spatial filter it has been possible to classify false
alarms into distinct types and postulate means for eliminating most of them. The key process is the
spatial filter itself and its ability to select target structures. The massive data reduction that
spatial filtering produces makes the problem of scan to scan correlation more tractable, and this in turn
reduces the data rate to manageable proportions for kinematic track processing. Indeed each step in the
pipeline is designed to reduce the load on the next, more sophisticated step. In this way the task has
been realised in real time without resorting to high technology or very high cost structures.


APPENDIX  1. 


CALCULATION  OF  SPATIAL  FILTER  PERFORMANCE 


The spatial filter uses a matrix of 9 pixels, a central element CE and 8 neighbours $V_1 \ldots V_8$.
Assuming that each pixel value is an independent sample of gaussian noise with unit variance and zero
mean, and that $V_1 \ldots V_8$ are numbered in order of increasing amplitude, the two algorithms used in the
filter are

    $CE > V_m + C$    (1)

    $CE > V_m + K(V_8 - V_1)$    (2)

where $V_m$ is the mean of the eight neighbours.
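
As a minimal sketch of the two tests on one 3 x 3 window, assuming that the mean and the spread (largest minus smallest) are both taken over the eight neighbours only, the filter decision could be written as follows. The function name and the list-of-rows window layout are illustrative choices, not taken from the paper.

```python
def spatial_filter_alarm(window, c, k):
    """Apply tests (1) and (2) to a 3x3 window given as three rows of three values.

    c: fixed threshold offset C; k: adaptive threshold factor K.
    Returns True only if the centre element passes both tests.
    """
    ce = window[1][1]
    neighbours = [window[r][col]
                  for r in range(3) for col in range(3)
                  if (r, col) != (1, 1)]
    mean = sum(neighbours) / 8
    spread = max(neighbours) - min(neighbours)  # V8 - V1 for the sorted neighbours
    passes_fixed = ce > mean + c                # test (1)
    passes_adaptive = ce > mean + k * spread    # test (2)
    return passes_fixed and passes_adaptive
```

An isolated bright pixel on a flat background passes both tests, whereas a pixel sitting on a straight intensity edge passes test (1) but fails test (2), because the edge inflates the neighbour spread.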
The frequency density function of V is

    $f(V) = \frac{1}{\sqrt{2\pi}} e^{-V^2/2}$

and the probability of having the value u in the range u to u + du on a single sample is f(u)du. The
mean value of a set of samples of gaussian noise is itself normally distributed about the population mean
with a frequency density function $f_m(V)$, which for 8 samples is

    $f_m(V) = \frac{2}{\sqrt{\pi}} e^{-4V^2}$    (3)


The pixel CE also contains a noise sample which may be shifted to the right of equations (1) and (2)
and subtracted from the mean.

The frequency density function $f_1$ of the thus modified mean is the convolution

    $f_1(V) = \int_{-\infty}^{\infty} f_m(\phi)\, f(\phi - V)\, d\phi$    (4)

This evaluates to a gaussian distribution

    $f_1(V) = \frac{1}{\sqrt{2.25\pi}} e^{-V^2/2.25}$    (5)


Given a signal of amplitude $V_T$ in the central element the probability of detecting it with the first
algorithm is the probability that $V_T > V_m' + C$, where $V_m'$ is the modified mean, which is

    $P_{d1}(V_T) = \int_{-\infty}^{V_T - C} f_1(x)\, dx$    (6)

and the probability of a false alarm on noise in a single sample is

    $P_{fa1} = \int_{-\infty}^{-C} f_1(x)\, dx$    (7)


The frequency density function of $V_8 - V_1$ may be derived as follows. The probability of taking a
sample in the range u to u + du is f(u)du and in the range w to w + dw is f(w)dw. The probability that a
sample will be taken between u and w is

    $g = \int_{u+du}^{w} f(x)\, dx$    (8)

Consequently the probability that, in selecting N samples, one will lie in the interval du, one in the
interval dw and the remaining N-2 in the intervening interval is

    $dP = N(N-1)\, f(u)\, f(w)\, g^{N-2}\, du\, dw$    (9)

since N(N - 1) is the number of ways in which one of the values may fall in each of the intervals du and
dw. Rearranging by making r = w - u and summing over all intervals du, the frequency density function $f_{RN}$
of $V_N - V_1$ is

    $f_{RN}(r) = N(N-1) \int_{-\infty}^{\infty} f(u)\, f(u+r)\, \bigl(g(u+r) - g(u)\bigr)^{N-2}\, du$    (10)

where g(x) here denotes the integral of f up to x. This is a skew distribution function with a mean value for N = 8 of 2.845.

The frequency density function of thresholds in algorithm 2, $f_2$, may be derived by convolving (5)
and (10), with the range scaled by K:

    $f_2(V) = \frac{1}{K} \int_{-\infty}^{\infty} f_1(\phi)\, f_{RN}\!\left(\frac{V - \phi}{K}\right) d\phi$    (11)

This integral must be solved numerically.

The probability of detecting a signal $V_T$ with this algorithm is the probability that
$V_T > V_m' + K(V_8 - V_1)$ which is

    $P_{d2}(V_T) = \int_{-\infty}^{V_T} f_2(x)\, dx$    (12)

and the probability of false alarm on noise is

    $P_{fa2} = \int_{-\infty}^{0} f_2(x)\, dx$    (13)



For  an  alarm  to  be  registered  both  of  algorithms  (1)  and  (2)  must  be  satisfied.  A  suitable  choice 
of  K  for  spatial  discrimination  is  between  1  and  1.5,  C  is  chosen  to  set  the  noise  false  alarm  rate,  see 
Figure  3. 
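
Two of the quantities in this appendix can be checked numerically with the Python standard library alone. The sketch below assumes unit-variance gaussian pixel noise as stated above, so that the modified mean has variance 1 + 1/8; the function names, the trial count and the random seed are arbitrary illustrative choices.

```python
import math
import random

# Standard deviation of (mean of 8 neighbours) minus (centre noise sample)
SIGMA_1 = math.sqrt(1.0 + 1.0 / 8.0)


def pfa1(c):
    """False alarm probability of test (1) on pure noise, P(V_m' < -C)."""
    return 0.5 * (1.0 + math.erf(-c / (SIGMA_1 * math.sqrt(2.0))))


def mean_range(n=8, trials=100_000, seed=1):
    """Monte Carlo estimate of the mean of (largest - smallest) of n samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += max(samples) - min(samples)
    return total / trials
```

With C = 0 the false alarm probability is one half, as expected for a symmetric distribution, and the simulated mean range for eight samples should come out close to the value of 2.845 quoted for the range distribution.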


TABLE 1

Track Association Rules

PLOTS IN     ACTION TAKEN FOR                   ACTION TAKEN FOR
SHORTLIST    ORIGINAL PREDICTION WINDOW SIZE    DOUBLED WINDOW SIZE

0            DOUBLE WINDOW AND TRY AGAIN        COAST TRACK
                                                CONFIDENCE -5

1            CONFIDENCE +10                     CONFIDENCE +5

2            TAKE NEAREST TO                    TAKE NEAREST
             PREDICTED COORDINATE               CONFIDENCE +5
             CONFIDENCE +5

3            TAKE NEAREST                       TAKE NEAREST
             CONFIDENCE +5                      CONFIDENCE +5

>3           CONFIDENCE -10                     CONFIDENCE +5

PREDICTION WINDOW ΔX = 30 PIXELS

SLOW TARGET RATE: LESS THAN 1 PIXEL MOVEMENT PER SCAN (2 PIXEL IN AZIMUTH)

CONFIRMATION REQUIREMENTS

CONFIDENCE = 25
COAST FLAG UNSET
SLOW TARGET FLAG UNSET
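
The confidence bookkeeping of Table 1 can be sketched as a small lookup. The pairing of table cells with columns follows the order in which the entries appear, and treating "CONFIDENCE = 25" as "at least 25" is an assumption; the names `associate` and `is_confirmed` are hypothetical.

```python
CONFIRM_LEVEL = 25  # confirmation threshold from Table 1 (assumed to mean ">= 25")

# Confidence change indexed by number of plots in the shortlist; 4 stands for ">3".
ORIGINAL_WINDOW = {1: +10, 2: +5, 3: +5, 4: -10}
DOUBLED_WINDOW = {0: -5, 1: +5, 2: +5, 3: +5, 4: +5}


def associate(plots_original, plots_doubled):
    """Return (confidence change, coasted) for one track on one scan.

    plots_doubled is only consulted when the original window is empty,
    i.e. after "double window and try again".
    """
    if plots_original > 0:
        return ORIGINAL_WINDOW[min(plots_original, 4)], False
    n = min(plots_doubled, 4)
    coasted = n == 0  # nothing found even in the doubled window
    return DOUBLED_WINDOW[n], coasted


def is_confirmed(confidence, coasted, slow_target):
    """Confirmation requires enough confidence and both flags unset."""
    return confidence >= CONFIRM_LEVEL and not coasted and not slow_target
```

Choosing the nearest plot when several fall in the window is left out here, since it only affects which plot updates the trackfile and not the confidence value.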
Fig.5 Examples of the 3 x 3 window spatial filter and its ability to reject features.
The algorithm: centre element > mean + K(maximum - minimum). Parameter K is usually set in the range 1 - 2.
1 e.g. cloud edge: K > 0.625, no alarm
2 e.g. horizon: CE = 1.1, mean = 0.7, max = 1.1, min = -0.3; K > 0.285, no alarm
3 e.g. fence post top: CE = 0.9, mean = 0.2, max = 0.9, min = 0.1; K > 0.875, no alarm

Fig.7 Variation in effective (non-adaptive) threshold C with target subtense (milliradians), window 2,
based on 80% acceptance by adaptive threshold K

Summary

An operational system is described which tracks moving objects in an image and servos a camera platform
to continue tracking. The system uses fast hardware contrast evaluation logic that supplies contour
coordinates to a supervisory control computer for further processing. Contrast evaluation is restricted to
a rectangular window, which is adapted to the object size automatically. The mini-computer, being free from
contour finding, has enough time during one frame (20 msec) to process the contour data from the preceding
frame. The essential criteria in treating problem situations are the changes in contour from past frames
to the present frame, after isolating the object motion from the superimposed camera motion. They are
processed in a kind of heuristic truth table. Improvements of the tracker are achieved merely by software
refinements, that is, by applying more elaborate algorithms.

The tracker was originally designed for ground-to-air tracking; meanwhile, however, ground-to-ground tracking
has also been demonstrated in realistic situations. Corresponding film material is shown at the conference.


1.  Introduction 

Imaging sensors are expected to become an increasingly important type of sensor for closed-loop controllers
in different kinds of "intelligent" systems such as occur in robotics or missile guidance. Automatic
tracking of moving objects with a computer-controlled TV-camera is one of the outstanding problems in this
area. Digitizing the video image and then processing it of course gives the greatest conceivable
flexibility; however, systems of this kind have so far been expensive and tend to consume a lot of computer
time, so real-time capabilities are very difficult to attain.

The operational system we are presenting here manages without digitizing the pictures. Instead it uses
a contrast evaluation logic as a fast interface to the control computer so that real-time tracking is
ensured with a high degree of reliability. The intelligence of the system is based on elaborate control
programs that check the contour coordinates for plausibility and try to make reasonable decisions
in problem cases.

As is well known, the most challenging problem in tracking is occlusion. The target may occasionally be
either partially or totally occluded. The system presented here is intended to show that, by elaborate
computer processing, the occlusion problem can be handled in the non-digitized case too.

2.  The  DFVLR  TV-Tracker 

2.1.  The  overall  system 

Figure 1 shows a simplified block diagram of the tracking system. The video signal is supplied by a black
and white gimballed camera and displayed on a color monitor. (The color functions of the monitor are used
to overlay the window location and threshold boundaries.) At the same time the analog video signal is
compared in the signal processing logic with an adjustable grey value level. Whenever the amplitude of the
video signal crosses this threshold, either rising or falling, the corresponding line number and pixel
number in this line are registered from counters synchronized with the frame and line sync pulses. At the
end of the line these values, having been accumulated in a fast memory, are transmitted to a process
computer.

Signal evaluation (by the hardware) occurs only within a window somewhat larger than the target
extension. This window is normally adapted to the object size automatically by the computer, taking into
account the predicted target position to make sure the target never touches the window edge. The computer
then has the task of calculating the center and extensions of the target from the given contour values and
servoing the camera so as to null the deviation between object center and screen center.

The  level  I  tracker  is  characterized  by  an  enormous  reduction  of  information  flow.  It  uses  only  the  first 
and  last  threshold  values  in  one  line  (inside  the  window)  and  it  actually  does  not  process  all  values  of 
the  object  contour,  but  only  the  extrema  in  the  horizontal  (x)  and  vertical  (y)  directions.  So  only  four 
values  are  actually  encoded  in  each  frame  and  used  for  decision  making.  These  four  extrema  may  be  due  to 
the  target  contour,  or  they  may  stem  from  occlusions  entering  the  window.  It  is  the  task  of  the  logical 
decision  part  of  the  computer  program  to  cancel  implausible  values  and  replace  them  by  estimated  ones. 
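
The reduction to four values per frame can be illustrated with a short Python sketch. It assumes the hardware delivers, for each line inside the window, the line number together with the first and last threshold crossing; the tuple layout and function names are hypothetical.

```python
def contour_extrema(lines):
    """Reduce per-line crossings to the four extrema of one frame.

    lines: iterable of (line_number, first_crossing, last_crossing).
    Returns (x_min, x_max, y_min, y_max), or None if nothing was reported.
    """
    x_min = x_max = y_min = y_max = None
    for y, first, last in lines:
        x_min = first if x_min is None else min(x_min, first)
        x_max = last if x_max is None else max(x_max, last)
        y_min = y if y_min is None else min(y_min, y)
        y_max = y if y_max is None else max(y_max, y)
    if x_min is None:
        return None
    return x_min, x_max, y_min, y_max


def centre_and_size(extrema):
    """Object centre and extension derived from the four extrema."""
    x_min, x_max, y_min, y_max = extrema
    centre = ((x_min + x_max) / 2, (y_min + y_max) / 2)
    size = (x_max - x_min, y_max - y_min)
    return centre, size
```

Because the running minimum and maximum are updated line by line, the extrema are available as soon as the last line of the window has been received.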

2.2.  System  Hardware 

The main task of the signal processing logic is to register the contour and position of an object in the
TV picture and transfer this data to the computer. Except for the time required for data transfer, the
computer is not involved in the image processing function.

The functioning of the signal processing unit is shown in Figure 2. Two versions of the system are
illustrated in this figure. The level I system can handle only one object (threshold crossing) per line,
whereas the level 2 system can handle up to 32 objects. All internal signals operate synchronously with
the TV camera timing.

2.2.1.  The  Level  I  System 

The position of any point inside the TV picture is described by its line number and pixel point number.
Consequently, the analog signal must be sampled at discrete times to provide a unique pixel point number
for each point on a scan line. For this, a crystal oscillator is used which provides pixel spacing equal
to line spacing. Taking into account the 3:4 aspect ratio of the camera, this results in a clock
frequency of 13 MHz. The pixel point counter is clocked by the output of this oscillator and reset by the
horizontal sync pulse.

In an analogous manner, the line counter is incremented by the horizontal sync pulse, and reset by the
vertical sync pulse. Since both counters operate synchronously with the camera scan, the position of the
electron beam is always known.
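
The 13 MHz pixel clock can be reproduced with a line of arithmetic. The sketch below assumes a 625-line raster with a 64 microsecond line period; these raster parameters are not given in the text and are chosen only because they are consistent with the 20 msec frame time mentioned in the summary.

```python
def pixel_clock_hz(total_lines=625, line_period_s=64e-6, aspect=4 / 3):
    """Clock rate that makes the pixel spacing equal to the line spacing.

    total_lines and line_period_s are assumed raster parameters;
    aspect is the width-to-height ratio of the picture.
    """
    pixels_per_line = total_lines * aspect  # as many pixels per unit length as lines
    return pixels_per_line / line_period_s
```

With these assumed values the result is about 13.02 MHz, in line with the figure quoted for the crystal oscillator.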

An object which differs significantly from the background in intensity results in a corresponding change
in the analog voltage from the camera. This voltage is compared with a reference threshold, and whenever
the two are equal, a comparator changes state. The threshold voltage can be fixed in advance by an
operator or set to follow the average intensity of the window.

With the rising edge of the comparator signal, the contents of the pixel point counter are latched. This
number then corresponds to the left boundary of the object. The right object boundary and corresponding
line number are latched with the falling edge of the comparator output, and may be overwritten by
successive falling edges in the same line, so that only the last one is retained. The data are then
transferred into buffer registers with the horizontal sync pulse, and a data transfer to the computer is
initiated. By using these buffer registers, the computer is able to perform the read operation at any time
during the next line.

When the new window position is computed, the positions of the four sides are transferred to a buffer
register, and with the next vertical sync pulse, this data is transferred to the inputs of a digital
comparator. This comparator compares the beam location (contents of the two counters) with the window
coordinates, and permits an output from the analog comparator only if the beam is inside the window. With
this signal from the digital comparator ("inhibit"), the window is marked on the monitor.
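
The latching behaviour of the Level I hardware on a single scan line can be mimicked in software. The following is only a behavioural sketch under the assumption that the comparator output is simply gated by the window signal; the function name and the sample-list input are illustrative.

```python
def scan_line(samples, threshold, window):
    """Simulate Level I latching for one line.

    samples:   video amplitude per pixel position.
    threshold: reference level of the analog comparator.
    window:    (x_left, x_right), inclusive pixel limits of the window.
    Returns (left, right): the first rising-edge pixel and the last
    falling-edge pixel; an entry is None if no such edge occurred.
    """
    x_left, x_right = window
    left = right = None
    above = False
    for x, value in enumerate(samples):
        inside = x_left <= x <= x_right
        now_above = inside and value > threshold  # comparator gated by the window
        if now_above and not above and left is None:
            left = x   # only the first rising edge is kept
        if above and not now_above:
            right = x  # each later falling edge overwrites the latch
        above = now_above
    return left, right
```

Two bright segments on one line therefore collapse into a single left/right pair, which is why Level I can represent only one object per line.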

2.2.2.  Level  2  System 

As mentioned above, with the second version of the system, it is possible to record more than two
threshold crossings, and consequently more than one object per line. The existing electronics were augmented
by the addition of a first-in-first-out (FIFO) memory. The pixel point and line counters provide input to
this memory. With the trigger signal from the analog comparator, the appropriate counter values are read
into the FIFO and at the end of the line (horizontal sync pulse), data transfer to the computer is
initiated. This data transfer is performed by a DMA channel which works by cycle stealing. Therefore, additional
CPU time is not required even though significantly more data is available (as compared to level I).

2.3. Structure of the camera control loop

In this section, the dynamical behavior of the closed loop for camera control (in x and y) is briefly
outlined. An essential implication is the temporal relation between scene perception, processing and
realization of camera motion (Table I).


    acceptance of contour coordinates        processing of the extrema
    per line in frame k                      from frame k-1

                                  concurrent

    determination of extrema                 computation of window and
    for frame k                              camera position for frame k+1

Table I. Software activities while frame k is recorded


With each frame starting pulse, the same operations are initiated, which is analogous to a normal sampled
data system. While frame k is recorded from the video signal, the computer is processing the contour
extrema of frame k-1. This program is concurrently interrupted by the transmission of the contour values of the
running line of frame k. Inherent in this transmission is a min-max search, so that at the end of frame k
the corresponding extrema are known. Processing of frame k-1, being a background job, is also finished at
the end of frame k; by then the computer has checked the supposed target motion for plausibility,
corrected it if necessary, predicted the target position two frames in advance, and calculated
window coordinates and camera position for frame k+1.


Consequently, by the time data transmission from the window in frame k+1 starts, the camera has been
commanded to move (by a command initiated after analyzing frame k-1), and will be in the correct position.


In a control theoretical sense, the plant here consists of a delay made up of two sampling periods (i.e.
frame periods). In the z-transform domain often used with sampled data systems, a delay of 2 periods is
characterized by a factor $z^{-2}$, while a prediction of two periods means multiplication by $z^{2}$ (see Figure 3).


On the screen, only the relative position between target and camera is measured. To
deal with occlusions, it is necessary to obtain the corrected target position by adding the camera motion.

The problem solving and processing program (see Figure 3) tries to solve this task in "problem" cases too,
and then predicts the ("true") target position two frames ahead in order to make up for the inherent delays.
After processing occlusions and estimating velocities (if necessary), the effects of camera motion are
reinserted to provide an error signal for the camera servo. A digital low pass filter processes this error
signal first. This filter is charged with:


a) smoothing the small contour jumps from frame to frame which are due to the interlaced scanning
techniques used in television.


b) supplying sufficient phase reserve ahead of the position-error integrator in the camera control circuit.

The dynamics of the closed loop are adjustable with the gain V (from Figure 3), so that the closed loop
poles yield sufficient stability. As the system contains only one integration, an object moving with
constant velocity generates a stationary position error that has no major influence on the tracking
capability.
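
The stationary position error of a loop with a single integration can be seen in a deliberately simplified simulation. The sketch below is not the loop of Figure 3: it omits the predictor and the low pass filter, models the camera as a pure integrator acting on an error that is two frames old, and uses an arbitrarily chosen gain of 0.3.

```python
def steady_state_error(velocity=1.0, gain=0.3, delay=2, steps=400):
    """Final position error of an integrating loop with a pure delay.

    velocity: target motion per frame; gain: loop gain V (assumed value);
    delay: number of frame periods between measurement and camera reaction.
    """
    camera = 0.0
    pipeline = [0.0] * delay  # errors measured but not yet acted upon
    error = 0.0
    for k in range(steps):
        target = velocity * k
        error = target - camera
        pipeline.append(error)
        camera += gain * pipeline.pop(0)  # react to the error from `delay` frames ago
    return error
```

For a constant target velocity v the error settles at v divided by the gain, a constant offset rather than a growing one, which is the behaviour described for the single-integrator loop. With this two-frame delay the simplified loop is only stable for small gains, hence the modest value chosen here.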


2.4.  Treatment  of  problem  cases  (e.g.  occlusions) 

The most critical part of the control program has the task of checking the measured (apparent) object
extrema for plausibility. For this purpose it uses:

a) the past, digitally filtered speed values of the contour extrema xmax/min, ymax/min and the computed
overall speed of these in x and y.

b) the object extension averaged over several frames.

Processing a frame implies checks for two kinds of disturbances (see Figure 4).

a) a jump disturbance, which occurs when suddenly one or several contour extrema (in the following
abbreviated as CE) coincide with window edges; in this case it can be assumed that an occlusion has entered
the window, and that part of the occlusion overlaps the edge of the window. Of course, for this to work,
the real target should never extend beyond the edges of the window. The window size is continuously
monitored to ensure that this assumption is true.

b) a velocity disturbance, which occurs when the measured interframe change of a contour extremum (CE)
does not have the same sign as the computed overall velocity.

Taking account of the typical structures of occlusions (trees, poles, horizon), different algorithms were
developed for the x and y directions, which, in condensed form, are as follows:

x-direction

In case of a jump disturbance or a velocity disturbance the velocity of the disturbed CE is not updated;
if both CEs are disturbed, pure prediction is performed, i.e., window size and speed keep their old
values. Moreover, in the case of a velocity disturbance in one CE, the disturbed CE is estimated from
the latest object size and the other (undisturbed) CE. Object size is updated only in undisturbed cases.

y-direction

Here only jump disturbances are checked. When a disturbance is recognized, the corresponding CE is again
estimated from the undisturbed one and updating of CE velocity is omitted. If both CEs are disturbed,
prediction is used.

If the growth of the y-extension taken over several frames exceeds a threshold, then the CE which is the
leading one with respect to object velocity is used to briefly (i.e. for one frame) pull the other CE
after it. Then (in the next frame) it will be determined whether the object size actually increased so
much or whether the object has to pass an occlusion that is stationary with respect to y.
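
The two disturbance checks and the replacement of a disturbed extremum can be sketched as follows. This is a condensed illustration only: it applies the "estimate from object size and the undisturbed CE" rule to any single disturbed extremum, leaves out the velocity bookkeeping, and treats a zero interframe change as undisturbed; all function names are hypothetical.

```python
def jump_disturbed(ce, window_edge):
    """A contour extremum lying on a window edge is taken as occluded."""
    return ce == window_edge


def velocity_disturbed(ce_now, ce_prev, overall_velocity):
    """True if the interframe change opposes the overall velocity."""
    change = ce_now - ce_prev
    return change * overall_velocity < 0


def correct_extrema(lo, hi, lo_bad, hi_bad, size, predicted):
    """Return plausible (lo, hi) extrema along one axis.

    size:      latest undisturbed object extension along this axis.
    predicted: (lo, hi) obtained by pure prediction from old values.
    """
    if lo_bad and hi_bad:
        return predicted          # both disturbed: pure prediction
    if lo_bad:
        return hi - size, hi      # rebuild the disturbed CE from the good one
    if hi_bad:
        return lo, lo + size
    return lo, hi                 # undisturbed: accept the measurement
```

In the undisturbed branch the caller would also update the stored object size and CE velocities, which this sketch leaves to the surrounding program.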

For example, in Figure 4, jump disturbances and velocity disturbances are sketched for the x-coordinate
direction. The jump disturbance at the left CE indicates that an occlusion has entered the window,
presuming of course that the target does not touch the window edges. So this CE is no longer used for
velocity updating, but tries to move to the end of the occlusion and to "wait" for the target there. Similar
jump disturbances in y, however, would immediately fix the window in y. This scheme takes account of the
fact that in many practical cases nearly horizontal movements coincide with vertically structured
occlusions, for example trees and buildings.

Registration of velocity disturbances as indicated in Figure 4 ensures that, independent of whether the target
reverses its direction behind an occlusion, the window is finally detached from the occlusion.

With level 2 hardware as described in section 2.2, all threshold crossings within the window are
registered. For tracking purposes the software remains nearly as described; however, if there is more than one
threshold crossing in a line, the run-length data are checked for plausibility and those segments
whose center is not within the predicted contour extrema are rejected. This improves performance
especially in the case of oblique occlusions entering the window.
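
The plausibility check on run-length data amounts to a simple filter. In this sketch each segment is a (start, end) pixel pair on one line and the predicted extrema are given for the x direction; the function name and data layout are illustrative assumptions.

```python
def plausible_segments(segments, x_min_pred, x_max_pred):
    """Keep only segments whose centre lies within the predicted extrema.

    segments: list of (start, end) pixel pairs for one line.
    """
    kept = []
    for start, end in segments:
        centre = (start + end) / 2
        if x_min_pred <= centre <= x_max_pred:
            kept.append((start, end))
    return kept
```

Segments belonging to an occlusion that crosses the window away from the predicted target position are dropped before the extrema are formed.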

Though the tracker was initially designed only for ground-air tracking with good contrast, it proved in
field experiments to show good tracking capabilities in ground-ground tracking with occlusion situations
too. Thus it can be shown that surprisingly good results can be achieved by intelligent processing of
dramatically reduced information (4 values per frame), needing only 30 - 40 % of real time on the computer.


3.  Conclusion 

The system presented here shows that by employing elaborate computer programs it is possible to achieve
good tracking capabilities although the TV-image is not digitized. Nevertheless it must be stated that
the algorithms are "synthetic", i.e. are based on ideas of how we intuitively perceive that a computer
could be directed to perform the tasks in question. No learning capability is provided. Thus in future
our main interest will be in how to make machines like the one presented here change their behaviour
based on past experience, i.e. to introduce a learning phase in which a set of typical situations is
shown to the tracker. Again this problem is to a great extent independent of whether a picture is
digitized or not.

SUMMARY


The target tracking we have studied relies on a method of logical image correlation.

The target, made up of a few tens of pixels, can be characterized by points located at the corners of
significant density boundaries.

Most of these points are present in successive images and share the same translation vector.
They are thereby distinguished from the background.

With this method the correlation is fast and effective, even against a cluttered background. It is also
"intelligent", because the evolution of the number of points often reveals its cause and allows the
best-suited algorithm to be chosen to try to avoid a possible loss of track.


INTRODUCTION


The aim of the study was to build a system for correlating two images automatically, where the
pair of images may be arbitrary. It very quickly became apparent, however, that the method used was
very attractive for target tracking because of its speed, the quality of its results
and its "intelligence" in separating the target from the background.

The basic principle rests on a logical correlation of two successive images. The two images are
reduced to two lists of characteristic points with coordinates (x,y), taking values that depend on the
local context. Comparing these two lists lets us create two classes, or more: the main one
corresponds to the target, the second to the background (or to a second target in multi-target tracking), the
other classes not being significant.

The characteristic points may evolve over time. A confidence test must be applied to
determine whether:

-  the correlation is considered good

-  the result is not evolving toward a loss of track, in which case it is necessary to
switch to memory mode.


This document describes:

-  the general principles of the method used

-  the application of these principles to target tracking

-  the "tracking" hardware

I - GENERAL PRINCIPLES OF THE METHOD
I-1. General


We propose to associate in pairs the points representative of the object. These points will be
called "characteristic points".

The guiding idea of the method is based on the hypothesis that the eye first detects
the lines separating surfaces of different tones, and in particular the vertices of the angles formed
by these lines.

Hypothesis: the two photos, or two homologous parts of the photo, correspond to each other through a translation
or small deformations (smaller than the analysis step).

In a digitization, since the spot size is larger than the distance between two consecutive
positions, a boundary is defined by three points:


A characteristic point can be defined by nine points: the point itself and the eight that
surround it.

We shall try to extract the characteristic points of the photo by making local
considerations on these 9 points.

The method has two parts:

-  extraction of the characteristic points,

-  correlation.


I-2. Search for characteristic points

We have defined characteristic points as break points on a boundary.
These points must therefore satisfy two conditions:

-  lie on a boundary, so the variance at the point must be high,

-  the boundary must form an angle at this point.

A characteristic point can appear in several ways:


If a point is characteristic, one or more of its neighbours may be characteristic as well.
In the adjacent figure, 4 points may be characteristic
according to the preceding definition.


Problem: given a point of the image and its 8 neighbours,
is this point characteristic?

An operator must be found that can answer this question.


The performance of the chosen operator can be checked very simply: on a photograph or a printer
output, one marks the points that appear most characteristic to the eye.
If the operator recovers a large proportion of these points, it can be considered
valid.


Choice of the operator

1.  If the point is characteristic, it must lie on a boundary. We compute the
variance of the point:

    $\sigma^2 = \frac{1}{9} \sum_{i=1}^{3} \sum_{j=1}^{3} (T'_{ij} - m')^2$,    $m' = \frac{1}{9} \sum_{i=1}^{3} \sum_{j=1}^{3} T'_{ij}$

The $T'_{ij}$ being the new transparencies after normalization.


2.  The boundary must form an angle at this point.

This is the most delicate part of this part of the program. It is difficult to find a simple
algorithm that accounts for all the possibilities. Sometimes one is also unable to make
a decision.


Is the central point characteristic?

Such cases can occur if the size of the details in the photo is of the order of
the digitization step, and also if there are
outlier points.


From part 1, the boundary exists and passes through the central point.

Among the 8 points that surround it, we must determine the 2 that are most likely to
lie on the boundary.

We use a Freeman code to represent the 3x3 square:

    3  2  1
    4  c  0
    5  6  7

Saying that the point with index i lies on a boundary
means that the transparency at this point is fairly close to
that of the central point and that the slope is steep.


We give two definitions, for $i \in \{0, \ldots, 7\}$.

Let T(i) be the transparency at point i and $T_c$ the transparency at the centre.

The circular gradient is the number G(i) = T(i+1) - T(i-1), with i+1 and i-1 computed
modulo 8.

The gradient with respect to the centre is the number $g(i) = T(i) - T_c$.

Hence:

    G(1) = T(2) - T(0)
    G(7) = T(0) - T(6)
    G(0) = T(1) - T(7)

and $g(1) = T(1) - T_c$.
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Definition  :  un  point  du  pourtour  sera  dit  point  fronti£re  si  : 

-  |G(i)  |  >  |g(i)  | 

-  il  est  un  des  extrema  de  la  fonction  K(i)  telle  que  ; 

-  K (i)  =  JjG(i)  |  -  |g(i)|J  x  signe  [g(u] 
si  l G (i) |  >  |g(i) | 

-  K (i)  *  0  si  | G ( i ) |  <  | g  ( i )  | 

We now spell out this definition and give some examples.

Consider a digitized right angle. Because the spot diameter is not negligible, the digitized response [...]

The points of index 2 and 4 will have values between $T_{min}$ and $T_{max}$, but these values will depend on the position of the spot relative to the sides of the angle. By contrast, the values of [...] and [...] will be very nearly constant.

If we want to estimate a slope at the point of index 2, it is preferable to take the value $T(3) - T(1)$ rather than, for example, $|T(2) - T(1)| + |T(3) - T(2)|$. This property explains the choice of definition for the circular gradient.


The transparency of the central point ($T_c$) must also be taken into account.

Consider the following examples:

[Figure: two example 3x3 neighbourhoods]

In the first case, it seems logical to run the boundary through points 2 and 4, and in the second case through points 1 and 5.

The gradient relative to the centre, $g$, is used to account for this phenomenon.

The first part of the definition, $|G(i)| > |g(i)|$, requires a boundary point to have a value closer to that of the centre than to those of its neighbours. In the examples, point 3 in particular must not be chosen.

$|G(i)| - |g(i)|$ is then always positive or zero. Multiplying by the sign of $G(i)$ preserves the direction of the slope and hence allows the two boundary points to be found.

A point is characterized by 2 numbers $n_1$ and $n_2$, which are the directions of the Freeman vectors of the boundary. For the point to be characteristic, it suffices that the difference between these two numbers is not 4:

$$|n_1 - n_2| \ne 4$$

At this stage of the program we have two arrays. The first, $V(i,j)$, contains the variances and the second contains the characteristic numbers of the boundaries. The form $N(i,j) = 10\,n_1 + n_2$ is used, which saves memory.
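The boundary-point definition and the characteristic-point test can be sketched as follows. This is only an illustration under stated assumptions: the perimeter values are indexed by Freeman code (0 = east, counting anticlockwise), the "extrema" of K are read as its maximum and its minimum, the sign is taken from G(i), and the packing N = 10 n1 + n2 is assumed. The function names are hypothetical.

```python
def boundary_directions(T, Tc):
    """Return (n1, n2, K) for one 3x3 neighbourhood.

    T  -- the 8 perimeter transparencies, indexed by Freeman code 0..7
    Tc -- transparency of the central point
    n1 and n2 are None when no decision can be made.
    """
    # Circular gradient G(i) = T(i+1) - T(i-1), indices modulo 8.
    G = [T[(i + 1) % 8] - T[(i - 1) % 8] for i in range(8)]
    # Gradient relative to the centre g(i) = T(i) - Tc.
    g = [T[i] - Tc for i in range(8)]
    K = []
    for Gi, gi in zip(G, g):
        if abs(Gi) > abs(gi):
            sign = 1 if Gi > 0 else -1
            K.append((abs(Gi) - abs(gi)) * sign)
        else:
            K.append(0)
    n1 = max(range(8), key=K.__getitem__)  # strongest positive slope
    n2 = min(range(8), key=K.__getitem__)  # strongest negative slope
    if K[n1] <= 0 or K[n2] >= 0:
        return None, None, K               # assumed "no decision" case
    return n1, n2, K


def is_characteristic(n1, n2):
    """A point is characteristic when the two directions are not opposite."""
    return n1 is not None and abs(n1 - n2) != 4


def pack(n1, n2):
    """Assumed packing of the two directions into one number."""
    return 10 * n1 + n2


# A right-angle corner: point 3 bright, points 2 and 4 on the boundary.
corner = [0, 0, 5, 10, 5, 0, 0, 0]
# A straight vertical edge through the centre.
edge = [0, 0, 5, 10, 10, 10, 5, 0]

c1, c2, _ = boundary_directions(corner, Tc=5)
e1, e2, _ = boundary_directions(edge, Tc=5)
print(c1, c2, is_characteristic(c1, c2))
print(e1, e2, is_characteristic(e1, e2))
```

For the corner the two boundary directions differ by 2, so the point is kept; for the straight edge they are opposite and the point is rejected.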

We saw earlier that if a point is characteristic, some of its 8 neighbours will be too. For the rest of the program this kind of situation must be eliminated. To do so, we keep the points of the array that are local maxima of the variance function.
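A minimal sketch of this selection, assuming the variance array is a list of rows and the candidates are (row, column) pairs. The strict comparison, which drops ties, and the function name are assumptions of this example.

```python
def keep_local_maxima(V, candidates):
    """Keep the candidates whose variance exceeds that of all their neighbours."""
    rows, cols = len(V), len(V[0])
    kept = []
    for (i, j) in candidates:
        neighbours = [
            V[a][b]
            for a in range(max(i - 1, 0), min(i + 2, rows))
            for b in range(max(j - 1, 0), min(j + 2, cols))
            if (a, b) != (i, j)
        ]
        if all(V[i][j] > v for v in neighbours):
            kept.append((i, j))
    return kept


V = [[1, 2, 1],
     [2, 9, 8],
     [1, 3, 1]]
kept = keep_local_maxima(V, [(1, 1), (1, 2)])
print(kept)
```

Here the two adjacent candidates collapse to the single point with the larger variance.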

II - THE POURSUITE PROGRAM

II-1. General

For data acquisition we currently use the RETICON RA 50x50 camera, containing 2500 photodiodes, together with its control electronics.

The data obtained with this camera are recorded on magnetic tape, image by image. This stage, which prevents us from working in real time, has not been eliminated for the following reasons:

- the camera is not mounted on a movable support, so full real-time processing is not possible,

- it is easier to use magnetic tapes to test the programs than to make acquisitions at each run,

- the interface between the camera and the computer's central processing unit poses no problem, and studying it was not the main purpose of this work.

The data are processed by the POURSUITE program, which searches for the image elements that make it possible to "recognize" the moving object.

II-2. Flow of the POURSUITE program

a) Main program - P.P

It calls the functional subroutines. From the results obtained by CORRE, it computes the displacement of the object and the coordinates of the object's central point in the next image; these coordinates are subsequently supplied to PCAR.

b) GRAD

This subroutine computes the gradient and the Freeman vectors at each point of the image.

c) PCAR

This subroutine builds a list of the first n characteristic points found by a circular search around the central point of the object to be tracked.

This search method was chosen arbitrarily. Depending on user needs, optional PCAR subroutines (PCAR1, PCAR2, ...) based on other methods of finding characteristic points can be created.


d) CORRE

Among all the pairs of characteristic points obtained for two successive images, CORRE decides which set of these pairs belongs to the object. The decision is based on the translations between the points of each pair.

On entry to this module, the two photos to be processed are represented by two lists of characteristic points, defined by their coordinates and the value of their orientation vectors.

The fundamental element of a characteristic point is the value of the orientation vector.

The first computation step consists in finding, for each characteristic point of the first photo, the characteristic points of the second photo that have the closest orientation.

For each pair so defined, the translation vector that carries the point of the first photo onto its counterpart in the second is determined. All the characteristic points of the first photo are explored in this way, and a table of translation vectors is drawn up with, for each vector, the number of point pairs that correspond under it.

Each class groups the vectors lying within ± 2 analysis steps (along the x and y coordinates) of the mean vector, which is recomputed each time a new vector enters the class.

This avoids creating artificial classes when the first translation vectors deviate too far from the mean translation vector.

Since each point may have several counterparts, the sum of the elements of the translation-vector classes is in general greater than the number of characteristic points.

Classes with fewer than four elements are arbitrarily eliminated, on the assumption that they are not representative of the motion of an object. If several independently moving objects were present on the 50 x 50 raster, this threshold would have to be reconsidered.

The translation-vector classes are then sorted in decreasing order of the number of their members. On the films we processed, two classes were generally obtained, exceptionally three.

In the most frequent cases, the first class can be considered representative of the object, while the second class represents the landscape.
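The classification of translation vectors described for CORRE can be sketched as follows. This is not the authors' code. It assumes each characteristic point is a tuple (x, y, (n1, n2)), that "closest orientation" means the smallest sum of circular differences between Freeman directions, and that the second list is non-empty. All function names are hypothetical.

```python
def circular_difference(a, b):
    """Difference between two Freeman directions, taken modulo 8."""
    d = abs(a - b) % 8
    return min(d, 8 - d)


def orientation_distance(o1, o2):
    return circular_difference(o1[0], o2[0]) + circular_difference(o1[1], o2[1])


def candidate_translations(points1, points2):
    """Translation vectors from each point of photo 1 to its closest-orientation
    counterparts in photo 2 (a point may have several counterparts)."""
    vectors = []
    for (x1, y1, o1) in points1:
        best = min(orientation_distance(o1, o2) for (_, _, o2) in points2)
        for (x2, y2, o2) in points2:
            if orientation_distance(o1, o2) == best:
                vectors.append((x2 - x1, y2 - y1))
    return vectors


def classify(vectors, tolerance=2, min_size=4):
    """Group vectors within +/- tolerance steps of a running class mean,
    drop classes with fewer than min_size members, sort by decreasing size."""
    classes = []  # each class keeps running sums and a count
    for (dx, dy) in vectors:
        for c in classes:
            mean_x, mean_y = c["sx"] / c["n"], c["sy"] / c["n"]
            if abs(dx - mean_x) <= tolerance and abs(dy - mean_y) <= tolerance:
                c["sx"] += dx
                c["sy"] += dy
                c["n"] += 1          # the mean is implicitly recomputed
                break
        else:
            classes.append({"sx": dx, "sy": dy, "n": 1})
    kept = [c for c in classes if c["n"] >= min_size]
    kept.sort(key=lambda c: c["n"], reverse=True)
    return [((c["sx"] / c["n"], c["sy"] / c["n"]), c["n"]) for c in kept]


# Five object points moving by (3, 1) and four static background points.
obj = [(10, 10, (0, 2)), (12, 10, (2, 4)), (12, 12, (4, 6)),
       (10, 12, (6, 0)), (11, 9, (1, 3))]
background = [(30, 30, (0, 3)), (35, 30, (1, 4)),
              (30, 35, (2, 5)), (35, 35, (3, 6))]
photo1 = obj + background
photo2 = [(x + 3, y + 1, o) for (x, y, o) in obj] + background

result = classify(candidate_translations(photo1, photo2))
print(result)
```

In this toy scene the largest class carries the object translation and the second class the stationary background, which mirrors the two-class situation reported for the processed films.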

The change in the number of points in the classes from one image to the next can, in some cases, reveal the causes of a possible loss of track and allow this failure to be remedied as well as possible.

In its basic version, the correlation subroutine supplies the main program with the translation (here in number of photodiodes) that the object has undergone between the two images. This datum makes it possible to compute:

- the displacement of the object in the image,

- the position of the object in the next image. This acts as an inertia of the camera. The coordinates of its centre are supplied to PCAR in the next pass for the circular search for characteristic points,

- the tangential displacement, velocity and acceleration of the object in the scene.

Actual displacement of the object (its velocity, its acceleration)

If $n$ is the displacement observed in the image, then the displacement undergone by the object in the scene is:

$$TR = \frac{n\,p\,D}{f}$$

with

p : centre-to-centre distance between two neighbouring photodiodes
D : object-to-camera distance
f : focal length of the lens

In the present case:

f = 12.5 mm and p = 0.0975

Thus: [...]


Since the image is in discrete form, the measurement of the translation carries an error. The relative error is:

$$\frac{\Delta TR}{TR} = \frac{1}{n}$$

Between two non-successive photos (adding up the translations computed in each pass), the relative error is at most:

$$\frac{\Delta TR}{TR} = \frac{\ell}{n}$$

where $\ell$ is the width (in number of steps) of the object in the image.

Indeed, with this correlation algorithm we cannot recognize the shape of the image. The points correlated at time $t$ are therefore not necessarily the same as those correlated at time $t'$ (unless $t' - t = t_0$ is the time between two successive acquisitions). There may be a "slippage" of the correlated points within the object, a slippage limited to its width.

From the results obtained so far, we have not observed any significant slippage, and:

$$\frac{\Delta TR}{TR} \approx \frac{1}{n}$$

The velocity is determined if the time between two acquisitions is known. The relative error on the velocity will be:

$$\frac{\Delta V}{V} \approx \frac{1}{n}$$

Likewise for the acceleration:

$$\frac{\Delta \gamma}{\gamma} \approx \frac{\Delta V}{V} \times \frac{V_2 + V_1}{V_2 - V_1}$$
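The relations of this section can be put into numbers with a short sketch. It assumes the usual lens relation TR = n p D / f implied by the definitions of p, D and f, and it assumes that p = 0.0975 is expressed in millimetres like f, so that TR comes out in the unit chosen for D. The function names are hypothetical.

```python
def scene_displacement(n, D, p=0.0975, f=12.5):
    """Displacement in the scene for an image displacement of n photodiodes.

    p and f are assumed to be in the same unit (mm); the result has the unit of D.
    """
    return n * p * D / f


def relative_error_translation(n):
    """Quantisation error of one analysis step on a displacement of n steps."""
    return 1.0 / n


def relative_error_acceleration(n, v1, v2):
    """Relative acceleration error from two successive velocities v1 and v2,
    taking the relative velocity error as 1/n for both."""
    return relative_error_translation(n) * abs((v2 + v1) / (v2 - v1))


tr = scene_displacement(n=10, D=100.0)   # object at 100 m, result in metres
err_a = relative_error_acceleration(10, 10.0, 14.0)
print(tr, relative_error_translation(10), err_a)
```

The acceleration error grows quickly when the two velocities are close, since their difference appears in the denominator.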


III - THE "POURSUITE" HARDWARE

III-1. General

The results obtained with the "POURSUITE" algorithm were satisfactory. However, the execution time on the E.T.C.A. computer (CIMSA, Mitra 15) is far from real time: 1.5 seconds per image of 50 x 50 points, using two lists of 20 characteristic points each.

We have undertaken to build a prototype based on the algorithm described above.

The prototype can be split into two independent parts:

- reading the image, detecting and listing the characteristic points,

- correlating the characteristic points and delivering the results.

Only the first part is currently being built.


III-2. Principle

The original image from the RETICON camera is coded on 8 bits. Offset corrections compensate for the differences in reference voltage between photodiodes.

The search for characteristic points is carried out on a binary image obtained by thresholding the variance image, in order to simplify the wiring. To perform such an operation without risk of error, the image must be smoothed before thresholding so as to eliminate noise.

Schematically, the algorithm for obtaining the characteristic points therefore proceeds as follows:

- computation of the variance,

- zone-by-zone smoothing relative to the local mean variance,

- thresholding,

- detection of the characteristic points.

The zone size is chosen according to the number of characteristic points requested, the number of transitions in the smoothed image, and the number of characteristic points obtained in the previous image.

The threshold is a function of the local mean variances of each zone.

To detect the characteristic points, each point can be coded on 8 bits according to its neighbourhood:

Neighbour numbering      Example neighbourhood

1 2 3                    1 0 0
8 . 4                    1 1 0
7 6 5                    1 1 0

Resulting word:

Bit:    1 2 3 4 5 6 7 8
Value:  1 0 0 0 0 1 1 1

Each word is then simply compared with the words of a table listing all possible characteristic points.

The result is coded as an 8-bit word consisting of 0s except in the two privileged directions, where 1s appear.
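The neighbourhood coding can be sketched as follows. The clockwise numbering starting at the top-left neighbour is read from the figure and treated here as an assumption, as is the choice of bit 1 as the most significant bit. The table of characteristic words is not given in the text, so none is invented here.

```python
# Offsets (row, column) of neighbours 1..8: clockwise from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def neighbourhood_word(image, r, c):
    """Return the 8 neighbour bits of binary image point (r, c), bit 1 first."""
    return [image[r + dr][c + dc] for dr, dc in OFFSETS]


def word_to_int(bits):
    """Pack the bits into one byte, bit 1 being the most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


example = [[1, 0, 0],
           [1, 1, 0],
           [1, 1, 0]]
word = neighbourhood_word(example, 1, 1)
code = word_to_int(word)
print(word, code)
```

A hardware or software lookup would then test whether `code` belongs to the set of words that correspond to characteristic points.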


IV - CONCLUSION

This study has produced a simple algorithm for tracking targets against a cluttered background, and has allowed the most time-consuming parts of the program to be hard-wired.

The method was validated using:

- films that we digitized frame by frame,

- a model object indoors.

These tests showed that the system works satisfactorily and that it is very promising as regards its future and its use.


TARGET TRACKING USING AREA CORRELATION


R.M.B. Jackson
EMI Electronics Ltd
Victoria Road
Feltham, Middx


SUMMARY 


This  paper  describes  an  Image  Tracking  System  which  uses  an  area  correlation 
technique.  The  system  operation  and  method  of  implementation  are  described  and  a 
short  film  of  a  tracker  operating  is  shown. 

INTRODUCTION 

With the increasing use of electro-optic imagers in weapon systems for aircraft, there is a need to provide an automatic track of targets of interest to relieve the operator, who may be the pilot, of this task. These systems must provide an output of the line of sight to the target, relative to an airframe datum, with the necessary precision and bandwidth to allow weapon aiming, command and control.

The  method  described  here  is  designed  to  operate  on  video  which  provides  a  grey 
scale  image  of  the  spatial  pattern  of  the  target.  This  pattern  may  come  from  either 
a  TV  (daylight  or  low  light  level)  or  an  Infra  Red  (IR)  imager.  The  spectrum  used  is 
not  important  to  the  tracker  provided  a  spatial  and  grey  level  pattern  is  available. 

The  process  examines  the  match  of  a  stored  area  reference  pattern  with  the  patterns 
present  in  successive  fields  of  video,  applying  equal  weights  to  the  comparison  of  equal 
areas.  This  provides  considerable  immunity  to  effects  of  small  changes  in  the  pattern 
which  may  be  caused  by  smoke,  fluttering  flags,  branches,  small  obscurations  from  ground, 
and  so  on. 

For  the  purpose  of  this  paper  I  will  confine  the  description  to  a  tracker  which  uses 
video  provided  in  TV  format,  because  this  is  well  known  and  is  commonly  used  for  display 
purposes  for  IR  as  well  as  TV  imagers.  It  is  of  course  possible  to  design  the  tracker 
to  operate  with  any  format  providing  successive  fields  or  frames  of  video  have  a  known 
relationship  to  one  another. 

VIDEO  SAMPLING 


These  systems  use  digital  processing  of  the  video  waveform  so  the  first  process  is 
to  sample  the  video  and  digitise  the  amplitude  to  provide  a  digital  representation  of  the 
pattern.  This  process  is  shown  in  Figure  1.  The  video  comes  from  the  imager  in  two 
interlaced  fields  of  horizontal  lines.  The  odd  and  even  fields  are  shown  in  the 
'magnified'  circle. 

The  tracker  operates  in  each  field  separately.  This  avoids  storing  one  field  to  be 
combined  with  the  next  and  also  increases  the  rate  at  which  measurements  of  target 
positions  can  be  provided  as  an  output. 

Each  line  is  sampled  along  its  length  and  the  grey  level  or  amplitude  of  each  sample 
is  digitised. 

The  number  of  samples  taken  along  the  line  is  matched  to  the  horizontal  resolution 
available  from  the  imager.  We  normally  take  two  samples  for  each  cycle  of  resolution 
available  to  preserve  the  image  data.  A  typical  number  -  for  normal  TV  bandwidth  -  is 
512  samples  per  line. 

The  grey  level  is  digitised  to  six  bits  providing  64  grey  levels.  This  is  usually 
sufficient  for  the  type  of  video  obtained  from  FLIR  or  TV  imagers.  It  could  however  be 
increased  if  the  signal  to  noise  ratio  of  the  imager  warranted  the  resulting  increase  in 
hardware  and  power.  Some  earlier  machines  used  fewer  levels  -  typically  three  or  four 
bits. 
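As a rough illustration of the digitisation step, the sketch below quantises a normalised video amplitude to six bits. The function name and the 0-to-1 amplitude range are assumptions made for this example, not details of the tracker.

```python
def digitise(amplitude, bits=6, full_scale=1.0):
    """Quantise an amplitude in [0, full_scale] to an integer grey level."""
    levels = 1 << bits                      # 64 levels for six bits
    code = int(amplitude / full_scale * levels)
    return max(0, min(levels - 1, code))    # clamp to the valid range


line = [0.0, 0.25, 0.5, 1.0]
grey = [digitise(a) for a in line]
print(grey)
```

Changing `bits` to 3 or 4 reproduces the coarser grey scale mentioned for earlier machines.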

TARGET  DESIGNATION 


The  next  step  is  to  designate  the  target  to  the  tracker.  We  ask  the  operator  to  do 
this.  The  format  of  the  display  is  shown  in  Figure  2. 

Using  a  joystick  control  he  moves  the  small  box  and  crosswire  marker  over  the  target 
of  interest,  adjusts  the  box  for  an  approximate  fit  to  the  target  and  selects  automatic 
track. 


The  aim  point  he  has  selected  may  be  adjusted  while  the  machine  is  tracking  -  without 
losing  lock  -  if  he  wishes  to  refine  its  position  later  in  the  engagement. 


The size of the box may also be readjusted during automatic tracking if he wishes.

AREA CORRELATION


The  machine  now  stores  from  the  video  at  the  time  auto  track  is  selected  the  pattern 
contained  in  the  area  of  the  field  of  view  designated  by  the  small  box. 

This is stored in the Inner Patch or Target Data store and is used as the memory of the target pattern for the tracking process. As we have seen, this is a three dimensional pattern of spatial and grey level descriptors of the target. This is shown in Figure 3.

The  size  of  the  target  patch  has  been  selected  to  suit  the  size  of  the  target 
pattern.  However  it  is  not  necessary  to  fit  it  precisely  to  the  target  outline  and  some 
background  in  the  patch  will  not  cause  loss  of  lock.  This  is  because  of  the  area 
averaging  effects  of  the  correlation  process. 

It is not necessary either to include the edges of the target within the patch. If there is a spatial and grey level pattern resolved within the target outline this can be tracked, and this may well be the case if the target image is large.

Typical sizes used for the Target Data patch range from 9 x 9 to 23 x 23 samples. It may of course be rectangular rather than square if required.

On succeeding fields of video information another store, the Outer Patch store, is loaded with data surrounding the last known position of the target. This is also shown in Figure 3.

The  size  of  this  store  is  selected  in  the  design  to  suit  the  dynamics  of  the  system 
requirements.  It  must  be  large  enough  to  be  sure  that  the  target  will  not  move  out  of 
this  area  of  the  field  of  view  within  one  field  time  of  the  imager.  This  movement  may 
be  caused  either  by  movement  of  the  target  within  the  background  or  of  course  by  movement 
of  the  boresight  of  the  imager  due  to  aircraft  movement,  vibration  and  so  on. 

A  search  radius  is  usually  defined  from  these  considerations  and  the  store  is 
designed  to  allow  the  largest  Inner  Patch  to  excurse  over  the  area  defined  by  this 
radius.  A  typical  figure  for  a  low  flying  aircraft  attacking  tank  targets  is  a  search 
radius  of  10  samples. 

The  correlation  process  is  now  carried  out  over  the  search  area  by  moving  the  Inner 
Patch  one  sample  at  a  time  to  each  possible  position  within  the  Outer  Patch.  This  is 
shown  in  Figure  4.  At  each  position  a  score  for  the  match  of  each  sample  in  the  Inner 
Patch  with  the  corresponding  sample  in  the  Outer  Patch  is  developed.  These  scores  are 
accumulated  over  the  Inner  Patch  area  and  this  figure  is  allocated  to  the  sample  address 
corresponding  to  the  centre  of  the  Inner  Patch  store. 

The  scores  so  generated  are  collected  and  may  be  plotted  as  shown  in  Figure  5. 

They  form  a  surface  which  has  a  peak  at  the  measured  position  of  the  target.  This 
position  can  be  expressed  as  an  address  in  the  sampling  matrix. 
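The search over the Outer Patch can be sketched as follows. The paper does not state how the per-sample match score is formed, so this example assumes the negative sum of absolute grey-level differences, which peaks at zero for a perfect match. Positions are reported as the offset of the top-left sample of the Inner Patch rather than its centre. The function names are hypothetical.

```python
def correlation_surface(inner, outer):
    """Score every position of the inner patch within the outer patch."""
    ih, iw = len(inner), len(inner[0])
    oh, ow = len(outer), len(outer[0])
    surface = []
    for r in range(oh - ih + 1):
        row = []
        for c in range(ow - iw + 1):
            # Equal weight for every sample, accumulated over the patch area.
            sad = sum(abs(inner[i][j] - outer[r + i][c + j])
                      for i in range(ih) for j in range(iw))
            row.append(-sad)
        surface.append(row)
    return surface


def find_peak(surface):
    """Return ((row, column), score) of the first highest score."""
    best_score, best_pos = None, None
    for r, row in enumerate(surface):
        for c, value in enumerate(row):
            if best_score is None or value > best_score:
                best_score, best_pos = value, (r, c)
    return best_pos, best_score


inner = [[9, 7],
         [5, 3]]
outer = [[0] * 6 for _ in range(6)]
for i in range(2):
    for j in range(2):
        outer[2 + i][3 + j] = inner[i][j]   # embed the target at (2, 3)

position, score = find_peak(correlation_surface(inner, outer))
print(position, score)
```

A real correlator evaluates the same sums in parallel hardware at video rate; the nested loops here only show the arithmetic.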

TARGET  POSITION  INFORMATION 


This  target  position  is  required  by  the  aiming  and  control  systems  referenced  to  a 
datum.  It  is  often  convenient  for  this  datum  to  be  the  centre  of  the  field  of  view 
(see  Figure  2),  but  it  may  be  a  point  at  another  position  related  to  the  field  of  view. 

The  target  position  is  then  expressed  as  an  x,  y,  co-ordinate  in  picture  samples 
from  this  datum. 

This position information is available in each field of video with a delay, for processing, from the instant when the Outer Patch store is completely loaded. This delay is fixed by the number of calculations which have to be carried out, which depends on patch size and search area. However it can be reduced by providing parallel chains of arithmetic logic, so there is a trade-off between delay and hardware complexity, size and power. Many servo systems require, or at least prefer, a fixed value for this delay, so it is often set at the maximum for a particular tracker system, and answers from the smaller patch sizes are delayed and output at this time.

CHANGES  IN  TARGET  PATTERN 


As  the  target  moves  through  the  background,  or  as  the  aircraft  approaches,  the 
image  of  the  target  will  change.  We  must  provide  an  automatic  means  of  coping  with 
these  changes. 

The effect on the correlation system as the image changes will be to reduce the correlation score (see Figure 5). Because of the area nature of the process small changes will produce very small reductions in peak score. But larger changes such as change of orientation or magnification due to approaching the target will result in significant reductions in peak score.




These  changes  can  be  detected  by  monitoring  the  peak  score  and  by  setting  a 
threshold.  If  the  peak  score  drops  below  this  threshold  the  machine  triggers  an 
'update'  where  the  contents  of  the  Inner  Patch  store  are  discarded  and  a  new  target 
image  is  loaded  by  taking  data  centred  on  the  latest  target  position.  This  is  then 
used  for  subsequent  fields  as  before. 

The  threshold  is  set  so  that  the  update  is  triggered  before  loss  of  lock  occurs, 
but  sufficiently  low  to  ensure  that  it  does  not  occur  too  often.  This  is  to  reduce 
the  effects  of  the  one  sample  uncertainty  caused  by  the  granularity  of  the  target 
position  measurement  leading  to  a  significant  shift  of  aim  point.  The  effects  which 
lead  to  the  requirement  for  an  update  usually  occur  relatively  slowly  compared  with  the 
field  rate  of  the  imaging  system  so  this  is  not  normally  a  problem. 
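The update rule can be sketched in a few lines. The names and the example threshold are hypothetical; the paper gives no numerical value for the threshold.

```python
def select_reference(inner_patch, latest_patch, peak_score, update_threshold):
    """Return (reference patch for subsequent fields, whether an update occurred)."""
    if peak_score < update_threshold:
        # Discard the stored pattern and reload from the latest target position.
        return latest_patch, True
    return inner_patch, False


reference, updated = select_reference("stored", "fresh", peak_score=40,
                                      update_threshold=50)
print(reference, updated)
```

Each update can shift the aim point by up to one sample, which is why the threshold is kept low enough that updates stay infrequent.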

EFFECTS  OF  OBSCURATION 


If the target is obscured so that there is no target image the tracker cannot track. This may occur through natural or man-made objects in the scene coming into the line of sight to the target or through deliberate deployment of countermeasures such as smoke.

The  method  of  overcoming  these  effects  is  to  use  a  track  memory  of  previous  results 
and  predict  the  movement  of  the  target  past  the  obstruction.  Two  and  three  term 
tracking  filters  are  used  for  this  purpose  and  these  can  provide  considerable  protection 
against  these  effects. 

The  predictive  mode  is  triggered  by  examining  the  correlation  surface  peak  score 
and  shape  of  the  peak.  Often  the  reduction  in  score  because  of  an  obstruction  is 
rapid  compared  with  the  signature  change  due  to  orientation  and  so  on.  It  is  also 
usually  much  more  extensive.  These  two  aspects  are  used  to  differentiate  between  the 
two  cases. 

When  obscuration  is  detected  the  tracker  continues  to  position  the  search  area  on 
successive  fields  centred  on  the  predicted  position  of  the  target.  The  update  procedure 
is  inhibited  and  the  correlation  search  is  carried  out  as  usual.  When  the  target  is 
reacquired  tracking  proceeds  normally. 

During  the  obscuration  the  target  position  output  is  giving  the  predicted  position 
to  allow  the  servo  control  to  continue.  However  a  signal  is  available  to  other  systems 
to  indicate  that  this  is  predicted  and  not  tracking  information. 

This  can  provide  information  for  a  short  period.  If  the  obscuration  continues  for 
a  long  period  however  lock  will  be  lost. 
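The predictive mode can be illustrated with a two-term filter. The alpha-beta form below is one common two-term tracking filter and is an assumption of this sketch; the paper does not name the filter used, and the gains and class name are hypothetical. One instance would be run per axis, with one step per video field.

```python
class AlphaBetaTracker:
    """Two-term (position, velocity) filter with a coasting mode."""

    def __init__(self, x0, alpha=0.5, beta=0.1):
        self.x = float(x0)   # filtered position, in samples
        self.v = 0.0         # filtered velocity, in samples per field
        self.alpha = alpha
        self.beta = beta

    def step(self, measurement=None):
        """Advance one field. Pass None while the target is obscured.

        Returns (position, predicted_flag).
        """
        predicted = self.x + self.v
        if measurement is None:
            # Obscured: output the prediction and flag it as such.
            self.x = predicted
            return self.x, True
        residual = measurement - predicted
        self.x = predicted + self.alpha * residual
        self.v = self.v + self.beta * residual
        return self.x, False


tracker = AlphaBetaTracker(0, alpha=1.0, beta=1.0)
outputs = [tracker.step(m) for m in (2, 4, None, None, 10)]
print(outputs)
```

The flag returned with each position plays the role of the signal that tells other systems the output is predicted rather than measured.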

PERFORMANCE 


The  performance  of  these  systems  depends  on  the  image  quality  provided  by  the  imager, 
the  signature  presented  by  the  target  and  the  consistency  of  this  information  as  it  passes 
through  the  field  of  view. 

Video signal to noise ratio is important; however, the area averaging which occurs in the area correlation process reduces the effect of noise spikes on the correlation score. In very severe noise the surface itself becomes noisy and it is possible for a false peak to be detected.

Spatial  frequency  response  and  grey  level  response  -  related  to  imager  M.T.F.  -  are 
also  important.  The  sampling  process  is  designed  to  use  the  maximum  spatial  frequency 
information  available  and  this  also  sets  the  measurement  grid  for  target  position 
information.  The  peak  of  the  surface  is  detected  to  the  nearest  sample  and  this  sets 
the  accuracy  of  the  target  position  measurement.  The  sharpness  of  the  correlation 
surface  peak  is  a  function  both  of  the  spatial  frequency  content  and  the  grey  level 
content  in  the  target  pattern.  If  the  peak  is  flat  a  smaller  noise  effect  could  cause 
a  shift  of  the  peak  detected. 

It  is  important  that  the  imager  response  to  a  particular  signature  is  consistent, 
at  least  through  the  search  area  of  the  field  of  view  otherwise  the  match  with  the  stored 
pattern  in  the  Inner  Patch  store  will  be  degraded. 

These  then  are  the  effects  which  set  the  performance  of  the  tracker.  Experience 
over  several  years  indicates  that  if  there  is  enough  information  in  the  picture  for  a  man 
to  recognise  the  target  (usually  a  pre-requisite  for  an  engagement)  there  is  enough 
information  to  track  it. 

IMPLEMENTATION 


Figure  6  shows  a  simplified  block  diagram  of  a  typical  Area  Correlator  tracker. 

The video from the imager is sampled and digitised and then fed to the two Patch Control logic blocks. The sampling is defined by a master clock in the tracker control block. This clock is locked precisely to the synchronisation system of the imager to ensure that the same sample in each field represents the same position in the field of view.

The Inner Patch control selects the video representing the target as designated either by the operator's joystick, during initial designation, or from the latest measured target position when an update is triggered. This patch of video is then stored in the Inner Patch store.

The  Outer  Patch  control  selects  the  video  centred  on  the  target  position  for  the 
Outer  Patch  store. 

The  correlator  then  accesses  the  data  from  both  patch  stores  and  carries  out  the 
correlation  process.  It  then  outputs  the  position  of  best  match  to  the  Target  Position 
Logic  and  Tracker  control.  The  Best  Match  Score  is  also  passed  to  that  block  and  to 
the  Update  Logic  block. 

The  Update  Logic  block  monitors  the  best  match  score  and  triggers  the  update 
process  when  necessary. 

The Target Position Logic and Tracker Control block processes the target position information to refer it to the required datum, carries out the tracking filter calculations, providing predicted target positions if necessary, and interfaces with operator controls and imager synchronising circuits. It also contains the master clock.

The circuit and logic are normally based on TTL integrated circuits and a microprocessor is used as a controller and to carry out the slower calculations. The correlator itself has to run at video rates and is normally constructed in hardware, often with several parallel chains to reduce the delay times.

The shape and size of equipment depend to a large extent on the facilities required and the environment the tracker is to be used in. The photograph (Figure 7) shows a tracker, control unit and display monitor for a commercial environment. For an airworthy equipment for a military application the tracker could be reduced to | ATR short format.

MAN-MACHINE INTERFACE


The  tracker  removes  the  on-line  tracking  task  from  the  operator.  The  requirements 
for  control  are  for  designation  of  the  target,  selection  of  tracking  box  size  and  switch 
to  automatic  track.  Facilities  are  also  available  to  adjust  the  tracking  point  and  to 
adjust  the  box  size  during  the  automatic  phase  if  required.  This  can  be  done  without 
the  machine  losing  lock. 

The  markers  shown  here  may  not  be  ideal  and  more  work  is  required  on  the 
optimisation  of  both  markers  and  joystick  control  laws  for  use  in  the  cockpit 
of  manned  interdictor  aircraft. 

DEMONSTRATION  FILM 

A  short  film  will  be  shown  illustrating  an  EMI  Electronics  Ltd.  tracker  operating 
with  a  TV  camera  fitted  to  a  two  axis  servo  controlled  mount  designed  and  built  by 
Evershed  Power  Optics  Ltd.  Chertsey,  England. 

CONCLUSION 


The tracking system described, based on the area correlator technique, can provide a stable and accurate track of targets for use in airborne systems in conjunction with FLIR or TV imagers. It is of small size and can be built into standard format packages for installation in military aircraft.

The main advantages and disadvantages of this technique may be summarised as follows:


Advantages 


1.  It  can  handle  targets  with  a  wide  range  of  characteristics 

2.  It  can  track  background  features  to  provide  a  ground  reference 

3.  It  does  not  require  that  the  edges  of  the  target  are  within  the 
tracking  patches.  It  can  handle  targets  which  completely  fill 
the  field  of  view. 

4. It can adapt automatically to magnification and target aspect
changes.

5.  It  provides  highly  tenacious  tracking  in  low  signal  to  noise 
conditions. 


Disadvantages 


1.  A very accurate designation is required if it is important to
track a specific point on the target.

2.  The target image must present a reasonable area within the field
of view of the imager.
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Applications  for  these  systems  include  automatic  control  of  laser  illuminators  for 
use  with  laser  seeker  semi-active  weapons;  control  of  imager  sightline  to  provide  stable 
pictures  to  aid  recognition  by  locking  onto  the  target  or  nearby  background;  and  the 
heart  of  a  passive  homing  missile  system. 
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Summary 

Image  processing  techniques,  quite  generally,  amount  to  a  signal  processing  problem. 
Target  tracking  is  a  special  case  of  such  a  signal  processing  procedure.  Essentially 
based  on  the  distinction  between  two  processes  (target  and  scenery),  tracking  is  broken 
down  into  three  parts;  signal  acquisition  and  reduction,  signal  classification  with 
respect  to  target  and  scenery,  and  determination  of  the  target  coordinates  with  the  aid 
of  the  classified  signals. 

It will be shown that Bayes' decision rule is optimal in a statistical sense.
The target coordinates are determined by calculating the center of gravity using the
probability of presence of the object pixels. In addition, the target coordinates are
filtered by means of a Kalman filter. The complete procedure is recursive and exhibits
adaptive, learning behavior. Similar algorithms have been used in other applications and
are known as unsupervised learning procedures.


Introduction 


This  paper  serves  to  provide  a  brief  summary  on  a  special  procedure  used  in  image 
processing,  known  as  tracking. 

Prior  to  describing  this  particular  procedure,  however,  I  should  like  to  give  some 
general  explanations  on  image  processing. 

As shown in Figure 1, the brightness (and possibly the color) measured, for instance,
in a square of 256 x 256 pixels, and the position in the measuring matrix are the only
data provided by the sensor. In addition one knows, however, and this is important for
further processing, that the terrain is static, that the object is compact and fluid,
i.e. the relationship between the pixels is not defined, and that the target obeys a
motion model.

Hence,  tracking  can  be  broken  down  into  two  sections: 

-  signal  acquisition  and 

-  signal  processing, 

whereby additional data, which are not measurable but can be derived from physical laws
and are largely determined by the problem to be solved, play a pivotal role in signal
processing.

Before we deal with the derivation of the theory, the function of the tracker, based
on Bayes' Theory, will be demonstrated with a simple block diagram.

The principal functions of a tracker can be broken down into 4 blocks:

-  signal acquisition and conditioning

-  signal processing or classification

-  determination of the target coordinates according to the classification results

-  filtering and prediction of the hit coordinates based on the motion model of the
target

These functions are summarized in the following simplified block diagram (Fig. 2), and
can also be regarded as a predicting algorithm for target coordinates.

As shown in Fig. 2, the algorithm is recursive. The a posteriori information obtained
from the classifier is used as new a priori information. Moreover, Fig. 2 reveals
that the signal acquisition is controlled by the classifier via the switching
function u.

Remarkable analogies to control problems appear. For instance, the a priori and a
posteriori information v and q, and the inputs to the classifier $f_1$, $f_2$ can be
interpreted as state variables. Therefore the availability or estimation of definite
initial conditions is necessary for initialization. This problem is well known in optimal
filtering. Thanks to its optimization and based on its a priori information, the
classifier correctly classifies and improves the measured signals with maximum probability.
Optimization is thus accomplished in a statistical sense.

The recurrence leads to the desired learning effect, i.e. the estimated position x, y
tends to the correct value as the number of signals measured increases. This adaptive
behavior is also characteristic of control problems. In contrast to most control
procedures, however, the algorithm used is extremely nonlinear and time-variant.

The brightness of the pixel constitutes the only input or feature in the tracker. The
interrelation between the pixel positions is not taken into account, so that each pixel
can be processed independently of the neighbouring pixels. This leads to a considerable
simplification of the classifier.

Therefore, the processing can easily be extended to all pixels, i.e. background
information, which accounts for the major part of the signals measured, can be included. For
this purpose, it is imperative that the background is position independent and that
it is mapped accordingly in the measuring matrix. This means that the position of
the sensor must be fixed or stabilized according to a reference, as otherwise the a
priori information that the background is static cannot be utilized.

Bayes' Theory.

After these preliminary remarks we will now explain Bayes' Theory. To solve the
tracking problem, we have to decide whether a pixel belongs to the target or the
scenery. Supposing that two competing processes exist, Bayes' Theory decides to which
process a pixel belongs. And this is exactly what tracking is aimed at:
namely to determine whether a pixel belongs to the target or not.

Bayes' Theory solves this problem by providing an optimal decision with respect to the
cost function, whereby the mean risk is minimized and the costs are equally
distributed over all possible erroneous decisions (Meyer-Brötz, Schürmann 1970).

This rule can be formulated as follows: if

$p_{\hat{j}} f_{\hat{j}} > p_j f_j$  for all  $j \neq \hat{j}$   (1)

is true, process $\hat{j}$ was measured.

The symbols in this rule are defined as follows:


$p_j$ = "a priori" probability that process j exists

$f_j$ = the density of x, whereby x is the measured brightness of process j

x = measured signal of a pixel, a standardized brightness in the range $0 \le x \le 1$

j = running index over all possible processes, $1 \le j \le n$

$\hat{j}$ = number of the process which was selected according to the classification rule

n = number of possible processes

Equation 1 may also serve to calculate the a posteriori probability of having measured
process i.

This probability, to which the proportionality

$w_i \sim p_i f_i$

applies, is designated $w_i$.

In addition, the equation applying to all probabilities is

$\sum_{i=1}^{n} w_i = 1$

and therefore

$w_i = \dfrac{p_i f_i}{\sum_{j=1}^{n} p_j f_j}$   (2)


For example, equation 1 yields the following decision for two processes, n = 2:

$\hat{j} = 1$  for  $w_1 > w_2$
$\hat{j} = 2$  for  $w_1 < w_2$
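
The decision rule and the normalization of equation 2 can be illustrated with a minimal Python sketch. The function names and the numerical values are hypothetical and serve only to show the arithmetic for one pixel; they are not taken from the paper.

```python
def posteriors(priors, densities):
    """A posteriori probabilities w_i = p_i f_i / sum_j p_j f_j (equation 2)."""
    joint = [p * f for p, f in zip(priors, densities)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("all p_i * f_i are zero; the pixel cannot be classified")
    return [v / total for v in joint]


def classify(priors, densities):
    """1-based index of the process with the largest p_j f_j (equation 1)."""
    w = posteriors(priors, densities)
    return max(range(len(w)), key=lambda i: w[i]) + 1


# Hypothetical pixel: process 1 (target) has prior 0.2, process 2 (terrain) 0.8;
# the density values at the measured brightness are f_1 = 3.0 and f_2 = 0.5.
w = posteriors([0.2, 0.8], [3.0, 0.5])
decision = classify([0.2, 0.8], [3.0, 0.5])
```

In this example the target density outweighs the small target prior, so the pixel is assigned to process 1 although the terrain is more probable a priori.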


The calculation and meaning of $p_j$ and $f_j$ have not yet been given and are explained
in the following.

Introduction of the terms 'visibility probability' and 'probability of presence'.

The density function of the terrain $f_2$ is treated as a function of the position, as the
terrain does not move and the dependence on the position therefore contains important
information. This does not apply to the density of the target $f_1$. It is calculated as a
histogram which is position independent, as the target is normally moving and changing
its shape.


Representation  of  the  density  functions 


In order to calculate the density functions, the structure of these functions
must be assumed. A Gaussian distribution may be used for the density $f_2$ (terrain). The
unknown parameters r and $\sigma$ must be estimated or learned.

$f_2 = \dfrac{1}{\sigma \sqrt{2\pi}} \, e^{-d^2 / (2\sigma^2)}$

$d = x - r$

The equations for r and $\sigma$ are as follows:

$r(k+1) = r(k) + \alpha \, (x(k) - r(k)) \cdot u$   (4)

$\sigma(k+1) = \sigma(k) + \alpha \, (|x(k) - r(k)| - \sigma(k)) \cdot u$   (5)


x = actual picture
$\alpha$ = gain < 1
r = reference picture
$\sigma$ = variance picture
k = time


Similar to equations 4 and 5, an estimate can also be set up for $f_1$. This is necessary
because $f_1$ may be subject to change.

$f_1(k+1) = f_1(k) + \alpha \, (\tilde{f}_1(k) - f_1(k)) \cdot u$   (6)

where $\tilde{f}_1(k)$ denotes the target density measured in time interval k, by analogy
with the measured brightness x(k) in equation 4.

u = 1 if the pixel is classified as a terrain point in time interval k

u = 0 if the pixel is classified as a target point in time interval k
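
The recursive estimation of the reference and variance pictures can be sketched for a single pixel as follows. This is only an illustration under stated assumptions: the gain value 0.1 and the function name are hypothetical (the text requires only a gain below 1), and the same gain is assumed for both equations.

```python
def update_terrain(r, sigma, x, u, gain=0.1):
    """One step of equations 4 and 5 for a single pixel.

    r, sigma: current reference and spread estimates
    x:        measured brightness of the pixel
    u:        1 if the pixel was classified as terrain, 0 if as target
    gain:     hypothetical value; the text only states that the gain is < 1
    """
    r_next = r + gain * (x - r) * u
    sigma_next = sigma + gain * (abs(x - r) - sigma) * u
    return r_next, sigma_next


# A terrain pixel of constant brightness 0.5 is learned over 200 time steps.
r, sigma = 0.0, 0.0
for _ in range(200):
    r, sigma = update_terrain(r, sigma, x=0.5, u=1)

# While the pixel is covered by the target (u = 0) the estimates are frozen.
frozen = update_terrain(r, sigma, x=0.9, u=0)
```

The switching function u is what prevents the brightness of the target from being learned into the terrain reference.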


Target  coordinates. 

As already mentioned, q consists of different probabilities, one of which,
referred to as the probability of presence, is used to calculate the target coordinates.
Although there are several ways of determining these coordinates, we have chosen the
center of gravity of the probability of presence to represent the target point.

Based on the coordinates of all the pixels classified as target points, the center of
gravity can be obtained by calculating the zeroth and first order moments. To begin with, a
target image Z(i,j) is created, using the probability of presence $q_1$ and the selectable
threshold value $q_s$:

$Z(i,j) = \begin{cases} 0 & \text{for } q_1(i,j) < q_s \\ 1 & \text{for } q_1(i,j) \ge q_s \end{cases}$
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Using  a  track  algorithm  the  process  'terrain'  can  be  further  broken  down  into  foreground 
and  background.  An  object  may  be  present  behind  a  foreground  without  being  visible.  A 
probability  can  be  formulated  for  such  hidden  objects,  and  is  referred  to  as  probability 
of  presence.  This  probability  includes  important  information,  since  coordinates  can  be 
defined  also  for  hidden  objects. 

However, the foreground cannot be recognized unless the object disappears behind it, i.e.
the foreground becomes visible, or is reconnoitred, only when the object is present. In this
case only an insignificant a posteriori visibility probability (VP) $w_1$ can be calculated
despite a high a priori VP (measuring probability $p_1$), the reason being that the brightness
x does not belong to the object density $f_1$ but to the density of the scenery $f_2$.

Since an object cannot disappear from the scenery, a certain probability can be formulated
for the presence of the foreground.

To explain the mathematical derivations, the individual probabilities are defined as
follows:


Definitions of the necessary probabilities.

$p_0$  a priori visibility probability for the foreground
$p_1$  a priori visibility probability for the object
$p_2$  a priori visibility probability for the terrain (fore- and background)

$w_0$  a posteriori visibility probability for the foreground
$w_1$  a posteriori visibility probability for the object
$w_2$  a posteriori visibility probability for the terrain

$v_0$  a priori probability of presence for the foreground
$v_1$  a priori probability of presence for the object
$v_2$  a priori probability of presence for the terrain

$q_0$  a posteriori probability of presence for the foreground
$q_1$  a posteriori probability of presence for the object
$q_2$  a posteriori probability of presence for the terrain

Although some of the values defined above are not required, they have been listed for
the sake of completeness.

Based on these relations the following equations can be established:

$p_1 = v_1 - v_0$   (3a)

$p_2 = 1 - p_1$   (3b)

$w_1 = p_1 f_1 / (p_1 f_1 + p_2 f_2)$   (3c)

$q_0 = v_0 f_2 / (p_1 f_1 + p_2 f_2)$   (3d)

$q_1 = w_1 + q_0$   (3e)

The next chapter demonstrates how $f_1$ and $f_2$ can be calculated using the measured
brightness of the pixels.

Provided that $v_0$, $v_1$, $f_1$ and $f_2$ are known, all possible probabilities can be
calculated.
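
As an illustration, the following Python sketch evaluates equations 3a to 3e for one pixel. It assumes that the equations read p1 = v1 - v0, p2 = 1 - p1, w1 = p1 f1 / (p1 f1 + p2 f2), q0 = v0 f2 / (p1 f1 + p2 f2) and q1 = w1 + q0; the function name and the sample values are hypothetical.

```python
def presence_probabilities(v0, v1, f1, f2):
    """Evaluate equations 3a to 3e for one pixel (assumed reading, see above).

    v0: a priori probability of presence for the foreground
    v1: a priori probability of presence for the object
    f1: object density at the measured brightness
    f2: terrain density at the measured brightness
    """
    p1 = v1 - v0                 # (3a) a priori visibility of the object
    p2 = 1.0 - p1                # (3b) a priori visibility of the terrain
    norm = p1 * f1 + p2 * f2
    w1 = p1 * f1 / norm          # (3c) a posteriori visibility of the object
    q0 = v0 * f2 / norm          # (3d) a posteriori presence for the foreground
    q1 = w1 + q0                 # (3e) a posteriori presence of the object
    return p1, p2, w1, q0, q1


# Hypothetical pixel whose brightness looks like terrain (f2 much larger than f1)
# although the object is expected there with high probability.
p1, p2, w1, q0, q1 = presence_probabilities(v0=0.3, v1=0.8, f1=0.2, f2=2.0)
```

With these values the visibility probability w1 is small, yet the probability of presence q1 stays considerably larger, which is the behaviour described for an object hidden behind a foreground.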


Representation  of  the  brightness  as  a  density  function. 

As  shown  in  the  preceding  paragraphs  the  density  function  f1  (object/target)  and  f2 
(scenery)  must  be  measured  for  classification  purposes. 

This can be achieved in several ways:

a)  the  density  function  is  position  dependent 

b)  the  density  function  is  position  independent. 

The density functions themselves, whether a function of the position or not, can be determined
parametrically or by counting (histogram). Three aspects play an essential role in
determining the densities:

a)  data  reduction 

b)  exact  description  of  the  density 

c)  simple  hardware  equipment 

It is expedient first to determine the sums of the rows and columns, which can be calculated
as follows:

$ZS(i) = \sum_{j=1}^{j_{max}} Z(i,j)$

$SS(j) = \sum_{i=1}^{i_{max}} Z(i,j)$


$i_{max}$, $j_{max}$: height and width of the target window

As a result, the zero and first order moments in i and j direction read as follows:

$M_{0i} = M_{0j} = \sum_{j=1}^{j_{max}} SS(j)$   (8a)

$M_{1j} = \sum_{j=1}^{j_{max}} j \cdot SS(j)$   (8b)

$M_{1i} = \sum_{i=1}^{i_{max}} i \cdot ZS(i)$   (8c)


And the center of gravity of the object is:

$S_i = M_{1i} / M_{0i}$   (9a)

$S_j = M_{1j} / M_{0j}$   (9b)

The filter shown in Fig. 1 is not only used to smooth the target coordinates, but also
serves to predict the future position of the target by means of a motion model. These
data, which cannot be obtained from Bayes' Theorem, are required for calculating
the a priori probability from the a posteriori probability.
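
The centre-of-gravity computation from the thresholded target image (row and column sums, moments 8a to 8c, coordinates 9a and 9b) can be sketched as follows. The function name, the use of nested lists and the sample probabilities are assumptions for illustration only.

```python
def centre_of_gravity(q1, threshold):
    """Centre of gravity of the thresholded presence map.

    q1 is a list of rows of presence probabilities; threshold is the
    selectable threshold. Indices i (row) and j (column) start at 1.
    Returns (S_i, S_j), or None if no pixel reaches the threshold.
    """
    z = [[1 if q >= threshold else 0 for q in row] for row in q1]
    zs = [sum(row) for row in z]            # row sums ZS(i)
    ss = [sum(col) for col in zip(*z)]      # column sums SS(j)
    m0 = sum(ss)                            # zero order moment (8a)
    if m0 == 0:
        return None
    m1j = sum(j * s for j, s in enumerate(ss, start=1))   # (8b)
    m1i = sum(i * s for i, s in enumerate(zs, start=1))   # (8c)
    return m1i / m0, m1j / m0               # (9a), (9b)


q1 = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.9, 0.1],
]
centre = centre_of_gravity(q1, threshold=0.5)
```

Working with row and column sums means that each pixel is visited once and the moments are formed from two short vectors, which suits a pipelined hardware realization.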


Calculation  of  new  a  priori  probabilities  on  the  basis  of  the  previously  calculated  a 
posteriori  probabilities. 

Since learning procedures work in an adaptive way, the a posteriori information acts
upon the a priori information of the following time step.

This leads to the following simple relationship:

$v_0(i,j,k+1) = q_0(i,j,k)$   (10)

k: time

i,j: position coordinates

Equation 10 presupposes that the process is not a function of the position and should
therefore be applied with caution.

As far as the foreground, and thus $f_2$ and $q_0$, are concerned, no difficulties exist, as the
foreground is static and does not change its shape.

This is different for the target, which moves and also changes its
shape and size and, as a consequence, calls for a prediction of the position and of the
change in shape. This problem can be solved in several ways. Provided that the target
position can be predicted exactly with a Kalman filter and that the change in shape
is slow, the following equation can be established:

$v_1(i,j,k+1) = q_1(i - \Delta i, j - \Delta j, k)$   (11)

whereby $\Delta i$ and $\Delta j$ constitute the difference between the predicted center of
gravity $\hat{S}$ and the measured one S:

$\Delta i = \hat{S}_i - S_i$,   $\Delta j = \hat{S}_j - S_j$   (12)
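
Equation 11 amounts to shifting the a posteriori presence map of the target by the predicted displacement. A minimal sketch, assuming integer pixel shifts and a fill value of zero for positions shifted in from outside the window (neither is specified in the text):

```python
def predict_prior(q1, di, dj, fill=0.0):
    """Shift the presence map q1 by (di, dj) pixels (equation 11).

    v1[i][j] = q1[i - di][j - dj]; positions whose source lies outside the
    window receive `fill`, a hypothetical choice not given in the text.
    """
    rows, cols = len(q1), len(q1[0])
    v1 = [[fill] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            si, sj = i - di, j - dj
            if 0 <= si < rows and 0 <= sj < cols:
                v1[i][j] = q1[si][sj]
    return v1


q1 = [[0.0, 0.0, 0.0],
      [0.0, 0.9, 0.0],
      [0.0, 0.0, 0.0]]
# Predicted centre one column to the right of the measured one: di = 0, dj = 1.
v1 = predict_prior(q1, di=0, dj=1)
```

The new a priori map carries the high presence probability to the pixel where the target is expected in the next time step.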


Realization. 


To conclude, some remarks will be made with respect to the realization. An
essential advantage of Bayes' Theorem is that the pixels can be classified
independently of one another. The measured signals can be classified immediately, in the same
order as they are emitted by the sensor, without the need for a complete image.

Since the information of the image is evaluated in its entirety, and therefore the
probabilities of both the target and the terrain are known, disturbances in the scenery
are recognized and misclassifications minimized.

No realization problems occur if a clock of 5 MHz is used. The mathematical algorithm
must be realized by means of pipeline techniques using a clock period of 200 ns. A word length
of 8 bits is sufficient, and automatic rescaling is necessary only for the division in
equation 2. The complete hardware is rather costly, as a large fast memory and
expensive ICs are involved.
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SUMMARY 


This paper describes procedures for the detection of moving or stationary objects in single
images and in image sequences, taken from a moving or stationary sensor. The objects are
primarily located on the ground in their natural environment, so that simple detection
procedures (e.g. detection by maximum intensity) cannot be applied.

For those situations a tracking system has been designed, simulated on a digital computer
and tested with equipment for real-time application. At the beginning the object is
detected by a human operator. From that moment it is tracked automatically, based on an
evaluation of the correlation function between the actual scene and a memory which is
continuously updated. The scale (distance between object and sensor) is determined
separately and normalized by the location of object parts. The basic shortcomings of a correlation
tracker system (e.g. image modification by foreground and background objects) have been
eliminated in a system where, in addition to the evaluation of the correlation function,
objects in the foreground or background are detected based on features like contrast, image
differences, contour lines, shape and relative speed of the objects.

To replace the operator at the beginning of the tracking, an automatic detection and
classification approach for the evaluation of single images has been simulated within a
cooperative research project conducted by several NATO countries. For image areas obtained
by a multiple threshold binarisation, geometric features are computed and evaluated with a
linear classification system, showing detection rates of approximately 90% and
classification rates of 50 - 60%.

1.  INTRODUCTION

Object detection in single images and in image sequences is an important task for many
applications (e.g. reconnaissance, battlefield surveillance, tracking of moving objects,
homing systems). So far the object detection in those systems has been carried out by man
(e.g. photo-interpreter, gunner). To reduce the time for the data interpretation or to
replace the man in dangerous situations, automatic systems are needed. In this paper
procedures for object detection in image sequences (correlation tracker) and in single
images (multi-threshold object detection in thermal imagery) will be described.

2.  CORRELATION  TRACKER 

Fig. 1 shows the basic components of a system for detecting and tracking objects in image
sequences. The actual scene is continuously monitored by a sensor (e.g. TV-camera,
IR-camera) and the sensor signals are processed on-line in an image processing unit, where an
operator can influence the processing by additional information about the object and the
environment. The result of the processing is the position of a selected object in each
image of the sequence; this position is used to control the sensor and its platform
and to start system reactions depending on the position of the object. The objects are stationary
(e.g. tower, bridge, road) or moving (e.g. vehicle on the ground, aircraft, ship).

The platform with the sensor system can also be stationary (e.g. fixed on the ground or on
a tower) or moving (e.g. mounted in a vehicle on the ground or in an aircraft). In the
following paragraphs the image processing unit is described primarily for the tracking of
objects on the ground, because those situations are rather difficult to handle due to
foreground and background problems. Detailed information is given in (BOHNER, M., 1976),
(KAZMIERCZAK, H., 1978), (BACH, S., 1978).

2.1.  Basic  System 

In the past many systems have been built where the image processing was based on a simple
evaluation of the intensity values of the object and its environment. After preprocessing
of the sensor signals (e.g. windowing, weighting, linear or non-linear two-dimensional
transformation) the maximum intensity levels or other similar features like centre of
gravity are used to detect the position of a selected object. Those systems only work
successfully when there is a high contrast between object and background (e.g. aircraft in the
sky, hot vehicles in a cold environment). Objects on the ground are often lost, especially
in the European landscape. Better results can be expected from correlation tracking.

Fig. 2 shows the principal mode of operation of a system based on the correlation function.
An operator detects an object of interest (the car) at the moment t(n); this object is
stored in a memory as a reference, and in the next image the correlation function between
the actual scene at t(n+1) and the reference memory is computed. The reference image has
to be shifted within a search area, which results in a field of correlation values K(x,y).
These values are used to detect the object in the actual scene by a decision criterion
and to update the reference memory for changing representations of the object.

Correlation function: As a measure of similarity between the object o in the actual scene
and the reference r, the correlation function in its normalized version was selected:

$K(x,y) = \dfrac{\sum_i (o_i - \bar{o})(r_i - \bar{r})}{\sqrt{\sum_i (o_i - \bar{o})^2 \, \sum_i (r_i - \bar{r})^2}}$

$o_i$, $r_i$: picture elements of object and reference

$\bar{o}$, $\bar{r}$: mean values of $o_i$ and $r_i$


The main benefit of this measure is its limited range of values (K = 1: r and o are equal,
K = 0: r and o are statistically independent) and its independence of a linear
transformation of the intensity levels (e.g. caused by changes in the illumination of the object).
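
A minimal Python sketch of the normalized correlation and of the field K(x,y) obtained by shifting the reference over a search area is given below. The function names, the list-based image representation and the small test scene are assumptions for illustration; a real-time system would not be built this way.

```python
import math


def correlation(o, r):
    """Normalized correlation K between two equally sized pixel lists."""
    mo = sum(o) / len(o)
    mr = sum(r) / len(r)
    num = sum((a - mo) * (b - mr) for a, b in zip(o, r))
    den = math.sqrt(sum((a - mo) ** 2 for a in o) * sum((b - mr) ** 2 for b in r))
    return num / den if den > 0 else 0.0


def correlation_field(scene, ref):
    """K(x, y) for every position of the reference inside the scene."""
    h, w = len(ref), len(ref[0])
    flat_ref = [v for row in ref for v in row]
    field = {}
    for y in range(len(scene) - h + 1):
        for x in range(len(scene[0]) - w + 1):
            patch = [scene[y + dy][x + dx] for dy in range(h) for dx in range(w)]
            field[(x, y)] = correlation(patch, flat_ref)
    return field


ref = [[1, 5], [9, 2]]
scene = [
    [0, 0, 0, 0],
    [0, 12, 20, 0],   # 2 * ref + 10: a linear change of the intensity levels
    [0, 28, 14, 0],
    [0, 0, 0, 0],
]
field = correlation_field(scene, ref)
best = max(field, key=field.get)   # decision by the maximum of the field
```

The object in the scene is a linearly transformed copy of the reference, and the maximum of the field is still found at its position, which illustrates the independence of a linear transformation of the intensity levels.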


System parameters: Parameters like number of grey-levels, resolution, size of reference
memory, correlation and update rate have to be adapted very carefully to the special
problem. As an example, the influence of the memory size on the two-dimensional correlation
field is shown in Fig. 3. For a memory size too small for a special type of object, the
autocorrelation has a sharp peak at the location of the object, but after a small
variation of the object a detection is no longer possible (Fig. 3a). With the proper size of
the memory the position of the object can be found in both cases (Fig. 3b).


Decision  criterion:  The  position  of  the  maximum  value  of  the  correlation  function  within 
the  search  area  was  selected  as  position  of  the  object. 

Update criterion: Two different criteria have been tested.

-  The  memory  is  updated  as  long  as  the  maximum  of  the  correlation  exceeds  a  threshold, 
that  means  as  long  as  a  good  similarity  between  reference  and  actual  scene  is  given. 

-  The  memory  is  updated  when  the  maximum  of  the  correlation  is  below  a  threshold, 
that  means  the  memory  is  only  updated  when  the  appearance  of  the  object  has  changed 
after  a  period  of  time. 

The simulation showed that for such a simple correlation tracker the success of both
criteria depends on the actual scene; no general preference for one criterion could be
found.


Such  a  simple  correlation  tracker  system  has  been  simulated  with  a  digital  computer  using 
sequences  of  real  images  in  the  visual  range  of  the  spectrum  and  has  been  developed  with 
special  hardware  for  real  time  application  with  a  TV-  or  IR-camera  as  sensor. 

2.2.  Object  Approach 

When  approaching  an  object  the  scale  of  the  object  representation  is  constantly  increased. 
Because  of  the  sensitivity  of  the  correlation  function  to  changes  in  scale,  the  object 
size  in  the  image  has  to  be  reduced  to  an  original  value.  The  amount  of  this  reduction 
can  be  determined  by  drift-references. 

The drift-references represent parts of the object to be tracked and can be extracted
from the first reference as conspicuous subareas (WINKLER, G., 1978) within the field of
the total reference. In a special realization the original reference has been subdivided
systematically into drift-references (Fig. 4a). After a period of time the location of
each drift-reference is determined independently (Fig. 4c). From the locations of all
drift-references a value for the reduction of the actual image size can be derived. This system
has been simulated successfully using test scenes as shown in Fig. 5.
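
The text does not state how the reduction value is derived from the drift-reference locations. One possible derivation, shown here purely as an assumption, compares the mean distance of the drift-references from their common centroid at the start and at a later moment. All names and coordinates are hypothetical.

```python
import math


def mean_spread(points):
    """Mean distance of the points from their common centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def scale_change(initial, current):
    """Magnification of the object between two sets of drift-reference locations."""
    return mean_spread(current) / mean_spread(initial)


initial = [(0, 0), (2, 0), (0, 2), (2, 2)]   # drift-reference centres at the start
current = [(0, 0), (3, 0), (0, 3), (3, 3)]   # the same references located later
magnification = scale_change(initial, current)
reduction = 1.0 / magnification              # factor applied to the actual image
```

Because the measure uses distances from the centroid, a pure translation of the object between the two moments does not affect the estimated scale.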

2.3.  System  Modifications 

The software simulation as well as the on-line hardware system revealed the basic problem
of the method, demonstrated in principle in Fig. 6. At the moment t(n) the reference
memory contains the object without any foreground objects, whereas in the actual scene the
car is partially hidden by the trunk of a tree. The maximum of the correlation function
decreases, but the object position can still be found with a high probability. If the
update criterion is active at this moment, the trunk is learned into the memory as a part
of the object. Therefore at the moment t(n+1) two peaks in the field of the correlation
values are obtained (matches of car and trunk). Depending on the size and contrast of the
trunk, a wrong object position may be found. A similar problem arises from large contrast changes
in the background of the object. To overcome these problems several modifications have
been simulated and evaluated.


In an environment with heavy contrast changes in the background a modified update
procedure should be preferred. A new reference is composed of a part of the old reference and
the actual image data. Because of the relative speed between a moving object and the
stationary background, only the moving object is well represented, whereas background
information is blurred. However, with this update procedure new problems may arise when the
representation of the object is changing very quickly.
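
One plausible reading of this update procedure is a weighted average of the old reference and the current image patch. The sketch below assumes that reading; the weight 0.75 and the function name are hypothetical and not taken from the paper.

```python
def blend_reference(old_ref, patch, weight_old=0.75):
    """Compose a new reference from the old reference and the current patch.

    weight_old is a hypothetical share of the old reference; the text only
    states that the new reference contains a part of the old one.
    """
    return [
        [weight_old * o + (1.0 - weight_old) * p for o, p in zip(row_old, row_new)]
        for row_old, row_new in zip(old_ref, patch)
    ]


old_ref = [[4.0, 4.0], [4.0, 4.0]]
patch = [[8.0, 4.0], [4.0, 0.0]]
new_ref = blend_reference(old_ref, patch)
```

Pixels that stay constant from image to image (the tracked object) keep their value, while pixels that change (background passing behind the object) are averaged out over successive updates.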

In many situations where an object has been lost based on a detection by the absolute
maximum of the correlation function, a local maximum was found at the position of the object.
Therefore the decision criterion was modified in such a way that all local maxima within
the search area are extracted and weighted by a factor depending on the distance from the
position predicted by an extrapolation of the object movement in the past. After the
weighting the original decision criterion can be used.
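
The modified decision criterion can be sketched as follows. The exponential weighting and its scale parameter are assumptions made for this illustration; the text only states that the weight depends on the distance from the predicted position.

```python
import math


def pick_position(field, predicted, scale=5.0):
    """Choose the object position among the local maxima of a correlation field.

    field maps (x, y) to K(x, y); predicted is the extrapolated position.
    Each local maximum is weighted by exp(-distance / scale), a hypothetical
    weighting function. Returns None if the field has no local maximum.
    """
    def is_local_max(pos):
        x, y = pos
        neighbours = [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                      if (dx, dy) != (0, 0)]
        return all(field[pos] > field[n] for n in neighbours if n in field)

    maxima = [p for p in field if is_local_max(p)]
    if not maxima:
        return None
    return max(maxima,
               key=lambda p: field[p] * math.exp(-math.dist(p, predicted) / scale))


# One row of correlation values: the object near x = 1, a stronger false match at x = 5.
field = {(0, 0): 0.1, (1, 0): 0.6, (2, 0): 0.2, (3, 0): 0.1,
         (4, 0): 0.3, (5, 0): 0.7, (6, 0): 0.2}
position = pick_position(field, predicted=(1, 0))
absolute = max(field, key=field.get)
```

In this example the absolute maximum lies on the false match, whereas the weighted local maximum stays on the object near the predicted position.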

The prediction of the object movement can be controlled by a background correlation (Fig. 7).
Image information in front of the moving object is stored in a background reference and
this reference is correlated at a fixed position in the scene over a period of time. The
correlation value at this position decreases when the moving object is crossing the
background and increases approximately to the original value K = 1 when the object has passed.
With the position x3 of the background reference and the moment t3 of the correlation minimum,
the prediction of the object movement can be corrected.

In another modification the object reference is subdivided systematically into
subreferences and the correlation values are computed for the total reference, a number of
subreferences and some combinations of subreferences. The correlation values of the
subreferences are very sensitive to small changes in the field of view of the object because of the
small size of the corresponding references (Fig. 3). This feature can be used in a system
demonstrated in Fig. 8 for segmentation into 6 subreferences and correlation with the
total reference and subreferences 4, 5 and 6: The moving object is detected in each image
of the sequence at the position of the maximum correlation value of the total reference.

Small foreground objects, however, are detected by a rapid decrease of the correlation
values of subreferences, and in the corresponding areas the reference is not updated.
Therefore the moving object can pass behind a foreground object without influence on the
reference (Fig. 8b and c).

2.4.  Correlation  and  Object  Detection 

With the modifications of the simple correlation tracker some problems (sensor noise,
foreground objects, contrast changes in the background) are less critical, but the basic
origin of these problems is not removed. Moreover, the simulation showed new problems which
have come up with the modifications (e.g. quick changes of the object representation). To
overcome this situation a new approach has been designed combining correlation and object
detection methods. The object to be tracked is mainly located by the correlation of the
actual scene and a reference. However, the evaluation of the correlation values and
especially the decision for the final object position is highly influenced by object detection
methods, where parts of the moving object and mainly foreground and background objects are
detected and identified within a period of time. The detection and identification of those
objects or parts of objects are based on features like

-  intensity  and  contrast  levels 

-  image  differences  of  "adjacent"  images  within  a  sequence 

-  distribution  of  contour  lines 

-  shape  of  special  areas  near  the  moving  object 

-  relative  speed  between  objects. 

This  approach  has  been  simulated  with  a  software  system  where  the  extracted  foreground  and 
background  objects  are  taken  into  account  by  correlation  and  update  templates  (Fig.  9) : 

-  Monitored  by  an  update  template  foreground  information  is  not  learned  into 
the  reference  memory 

-  Monitored  by  a  correlation  template  the  correlation  values  are  only  computed 
for  picture  elements  which  represent  the  object  with  a  high  probability,  other 
picture  elements  are  neglected. 
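
The role of the two templates can be sketched as follows. The sketch assumes that both
templates are binary masks of the same size as the reference, that the correlation value is
a normalised correlation coefficient and that the reference memory is updated by a recursive
blend; the function names and the parameter `rate` are hypothetical.

```python
from math import sqrt


def masked_correlation(reference, scene, correlation_template):
    """Normalised correlation using only pixels enabled in the template."""
    pairs = [(reference[r][c], scene[r][c])
             for r in range(len(reference))
             for c in range(len(reference[0]))
             if correlation_template[r][c]]
    if not pairs:
        return 0.0
    mean_ref = sum(p for p, _ in pairs) / len(pairs)
    mean_scn = sum(q for _, q in pairs) / len(pairs)
    num = sum((p - mean_ref) * (q - mean_scn) for p, q in pairs)
    den = sqrt(sum((p - mean_ref) ** 2 for p, _ in pairs)
               * sum((q - mean_scn) ** 2 for _, q in pairs))
    return num / den if den else 0.0


def masked_update(reference, scene, update_template, rate=0.25):
    """Learn the scene into the reference only where the update template allows it."""
    return [[(1 - rate) * reference[r][c] + rate * scene[r][c]
             if update_template[r][c] else reference[r][c]
             for c in range(len(reference[0]))]
            for r in range(len(reference))]
```

Pixels flagged as foreground are simply left out of both the correlation sum and the update,
so a bright occluding object neither lowers the correlation value nor enters the reference.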

The simulation with test scenes in the visual range (examples in Fig. 10) has been very
successful; a real-time hardware system is under development.

3.  MULTI-THRESHOLD  OBJECT  DETECTION  IN  THERMAL  IMAGERY 

All approaches mentioned above need a human operator for the detection, classification and
selection of the target to be tracked at the beginning of the sequence. Furthermore, a target
lost in a special situation within the sequence normally cannot be reacquired automatically,
and new targets entering the field of view at a later stage are not detected. To overcome
these shortcomings, objects have to be detected and classified automatically in single images.

Under Panel III of the NATO Defence Research Group, Research Study Group 9 (NATO AC/243
(Panel III)/RSG.9) conducts cooperative research projects in image processing among the
participating countries Canada, Denmark, France, the Federal Republic of Germany, the
Netherlands, Norway and the USA. Its first cooperative research project had the final goal
of discriminating and classifying operational military targets in natural scenes from
thermal imagery. Reaching this goal would solve the problems mentioned above.

In conducting this project the participating countries agreed on a common data base to
which all algorithms developed in the different countries have been applied. The project
was concluded by a mutual evaluation of the various methods with respect to segmentation,
detection and classification performance, considering the requirements for real-time
hardware systems. A final report is under preparation (SEVIGNY, L., 1980).

In the following paragraphs the German approach within this cooperative project is
discussed in principle; a detailed description is given in (EBERT, A., 1980).


3.1.  Data  base 

The Alabama data base was selected as the common data base. It consists of 43 images
in the thermal range of the spectrum (3-5 µm and 8-14 µm) containing 84 vehicles:

40 tanks

29 armoured personnel carriers (APC)

15 jeeps

together with other foreground and background objects. Typical examples of the imagery are
shown in Fig. 11. The vehicles are always brighter than their local environment, but some
have a very low local contrast. Some vehicles are merged into a single blob with nearly no
contrast to each other. The size of the targets (target area) varies from 10 to 500 picture
elements, with a grey-level resolution of approximately 8 bit in the total imagery.

For the cooperative research project the terms detection and classification have been
defined in the following way:

detection:       discrimination between targets and background. The
                 targets are not split into different types.

classification:  discrimination of detected targets into the three
                 classes tank, APC and jeep.


3.2.  Pre-Segmentation 

The  development  of  the  segmentation  algorithm  was  based  on  the  idea  that  the  moving  targets 
are  warmer  or  in  the  given  representation  brighter  than  the  environment.  In  this  case  it 
is  possible  to  separate  the  targets  from  the  background  by  means  of  binarisation  of  the 
images  using  an  appropriate  threshold.  Because  of  the  big  variety  of  the  target  intensities 
such  a  threshold  has  to  be  computed  for  each  target  separately.  In  Fig.  12  three  thresholds 
have  been  used  and  it  can  be  seen  that  for  each  target  class  a  different  threshold  is  op¬ 
timal  (a)  for  the  tank,  b)  for  the  APC,  c)  for  the  jeep) .  There  are  well-known  methods  for 
finding  such  object  adaptive  thresholds  if  the  contrast  between  the  object  to  be  separated 
and  the  background  is  fairly  high  and  if  some  a  priori  information  about  the  number  and 
location  of  targets  and  their  types  is  available.  Such  information  was  not  taken  into 
account  and  furthermore  it  was  considered  that  in  the  future  the  contrast  will  become 
smaller  due  to  better  heat-camouflage  techniques.  Therefore  the  following  procedure  was 
chosen:  A  certain  number  of  greylevels  are  selected  systematically  as  thresholds  and 
applied  separately  to  the  total  image  (level-slicing) ,  so  that  a  certain  number  of  binary 
images is derived from each greylevel image. If the local contrast of a target is higher
than the chosen distance between two adjacent thresholds, the target is separated from
the background in at least one binary image. Feature extraction and object classification
are applied to each binary image as described in the following paragraphs.
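
The level-slicing step can be sketched in a few lines. The sketch assumes that a pixel
belongs to the binary region when its greylevel reaches the threshold and that regions are
4-connected; both choices, and the function names, are assumptions made for illustration.

```python
def level_slice(image, first, last, step):
    """Return {threshold: binary image} for thresholds first, first+step, ... <= last."""
    return {t: [[1 if pixel >= t else 0 for pixel in row] for row in image]
            for t in range(first, last + 1, step)}


def count_regions(binary):
    """Count 4-connected regions of 1-pixels with an iterative flood fill."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# Two targets of different brightness joined by a weak bridge (made-up greylevels).
image = [
    [10, 10, 10, 10, 10, 10, 10],
    [10, 60, 60, 30, 90, 90, 10],
    [10, 60, 60, 30, 90, 90, 10],
    [10, 10, 10, 10, 10, 10, 10],
]
slices = level_slice(image, first=20, last=100, step=20)
regions = {t: count_regions(b) for t, b in slices.items()}
```

In this toy image the two targets are merged at the lowest threshold, separated at the two
intermediate thresholds, and only the brighter one survives at threshold 80, which is why
each binary image is evaluated separately.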

3.3.  Feature  Extraction 

As  a  result  of  the  binarisation  three  types  of  regions  are  segmented: 

-  those  representing  a  target 

-  those  representing  a  part  of  a  target 

-  those  representing  background. 

In  order  to  be  able  to  discriminate  between  these  types  and  to  classify  the  targets  it  is 
necessary  to  determine  characteristic  features  for  the  different  types  of  regions. 

As  the  images  are  monospectral  and  as  the  application  of  textural  features  is  not  possible 
because  of  the  size  of  a  great  part  of  the  targets  the  only  properties  which  have  been 
used  for  the  description  of  the  binary  regions  (masks)  are  geometric  features  (size,  shape 
and  the  course  of  the  contour)  and  the  distribution  of  the  brightness  within  the  masks. 

A  list  of  all  parameters  with  their  mathematical  definition  is  given  in  Table  1. 

3.4.  Preclassification 

After  describing  the  obtained  regions  by  the  chosen  features  the  desired  target  classes 
are  found  and  separated  by  a  classifier,  background  and  foreground  objects  are  rejected. 
After  the  selection  of  a  classification  system,  the  system  parameters  have  to  be  optimized 
in  a  training  phase.  As  agreed  within  NATO-RSG.9  for  this  training,  10  representatives  for 
each  class  of  targets  had  to  be  used.  The  targets  were  visually  chosen  in  such  a  manner 
that  the  typical  variations  of  each  class  are  found  within  the  training  set.  The  binarisa¬ 
tion  of  the  selected  representatives  was  done  interactively  with  three  different  thresholds 
which  were  chosen  so  that 

-  the  contour  is  completely  outside  of  the  generated  target  region, 

-  the contour coincides with the contour of the target region,

-  the  contour  is  completely  within  the  target  region. 

By  this  method  of  applying  different  thresholds  for  the  same  target  two  advantages  are  ob¬ 
tained: 

-  the  number  of  samples  is  increased  which  is  necessary  because  of  the  17  fea¬ 
tures  which  have  been  selected  for  describing  the  regions  and 


-  a target will be detected on more greylevels.

P1   size of target region (SR)
P2   length of perimeter (LP)
P3   SR/LP
P4   max. radius of inertia (J1)
P5   min. radius of inertia (J2)
P6   J2/J1
P7   JW2/SR
P8   (J1 + J2)/LP
P9   angle α between the horizontal
P10  mean greylevel within target
P11  variance of greylevels within target
P12  $(1/H) \sum_i |W_{i+1} - W_i|$                W_i   width of line i
                                                  H     height of the comprehensive rectangle
P13  $(1/W) \sum_i |H_{i+1} - H_i|$                H_i   height of column i
                                                  W     width of the comprehensive rectangle
P14  $(1/H) \sum_i |WP_{i+1} - WP_i|$              WP_i  number of pixels in line i
                                                  H     see above
P15  $(1/W) \sum_i |HP_{i+1} - HP_i|$              HP_i  number of pixels in column i
                                                  W     see above
P16  $(1/B) \sum_i |X_{i+1} - 2X_i + X_{i-1}|$     X_i   x-coordinate of pixel i on the perimeter
                                                  B     number of pixels on the perimeter
P17  $(1/B) \sum_i |Y_{i+1} - 2Y_i + Y_{i-1}|$     Y_i   y-coordinate of pixel i on the perimeter
                                                  B     see above


Table 1: Definition of features
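
Two of the profile features of Table 1 can be computed as in the following sketch. It
assumes that the "comprehensive rectangle" is the bounding rectangle of the region, that
the "width of line i" is the extent between the outermost region pixels of that line, and
that the "number of pixels in line i" is a plain count; these readings and the function
name are assumptions.

```python
def profile_features(mask):
    """Return SR, H, W, P12 and P14 for a binary region mask (list of rows)."""
    rows_used = [r for r, row in enumerate(mask) if any(row)]
    cols_used = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, bottom = rows_used[0], rows_used[-1]
    left, right = cols_used[0], cols_used[-1]
    height = bottom - top + 1          # H of the bounding rectangle
    width = right - left + 1           # W of the bounding rectangle

    line_widths, line_counts = [], []
    for r in range(top, bottom + 1):
        cols = [c for c in range(left, right + 1) if mask[r][c]]
        line_widths.append(cols[-1] - cols[0] + 1 if cols else 0)   # W_i
        line_counts.append(len(cols))                               # WP_i

    p12 = sum(abs(b - a) for a, b in zip(line_widths, line_widths[1:])) / height
    p14 = sum(abs(b - a) for a, b in zip(line_counts, line_counts[1:])) / height
    return {"SR": sum(line_counts), "H": height, "W": width, "P12": p12, "P14": p14}


mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
]
features = profile_features(mask)
```

P12 reacts only to changes of the outline, whereas P14 also reacts to holes inside the
region, as the hole in the third line of the example shows.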


By means of regression analysis a linear classifier was determined. This classifier C
transforms the feature vector x into the discrimination vector d:

d  =  C  *  x

which estimates the similarity of the region with the different classes. The similarity is
optimal if the computed discrimination coefficient is 1.0 and decreases with increasing
distance from 1.0. Moreover, the similarities with the other classes must be clearly smaller
in order to obtain an unambiguous assignment to one class. A region is therefore accepted as
a probable target of the class for which the highest similarity was found, under the
additional conditions that the discrimination coefficient lies inside a fixed interval
around 1.0 and exceeds the values of all other classes by a fixed amount.
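
The acceptance rule can be sketched as follows. The sketch takes the class whose coefficient
is closest to 1.0 as the best match and reads the threshold triple 0.5/1.1/0.2 listed with
Table 2 as lower bound, upper bound and required margin; this reading, the toy classifier
matrix and the function names are assumptions.

```python
def discriminate(classifier, features):
    """d = C * x for a classifier given as a list of rows."""
    return [sum(c * x for c, x in zip(row, features)) for row in classifier]


def preclassify(d, classes=("tank", "APC", "jeep"),
                lower=0.5, upper=1.1, margin=0.2):
    """Return the accepted class name, or None if the region is rejected."""
    best = min(range(len(d)), key=lambda k: abs(d[k] - 1.0))
    if not lower <= d[best] <= upper:
        return None                      # outside the interval around 1.0
    for k in range(len(d)):
        if k != best and d[best] - d[k] < margin:
            return None                  # not clearly ahead of another class
    return classes[best]


# Toy 3 x 2 classifier and feature vector, for illustration only.
C_TOY = [[0.5, 0.2], [0.1, 0.1], [0.0, 0.05]]
label = preclassify(discriminate(C_TOY, [1.0, 2.0]))
```

Tightening `lower` and `margin` rejects more regions, which trades detections for fewer
false targets.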

3.5.  Final  Classification 

An analysis of the Alabama data base revealed that the targets usually have sharp contrast
edges, so that the binary target areas are stable over a certain number of adjacent
thresholds. Background and foreground objects, however, may have a similar size and shape
for a single threshold, but because of their smooth edges the preclassification result
changes rapidly when the binarisation threshold is modified.

The  investigations  showed  that  for  a  selected  stepwidth  of  2  greylevels  the  false  targets 
usually  are  assigned  to  a  class  not  more  than  two  times.  Therefore  we  accept  a  probable 
target  as  a  real  target  (detection)  if  it  is  assigned  to  any  class  of  targets  three  or 
more  times.  The  class  of  the  target  (classification)  is  that  one  which  was  selected  by 
the  preclassification  for  the  majority  of  thresholds. 
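
The decision over the threshold stack can be sketched as a simple vote. The sketch assumes
that the preclassification results of one region have already been collected over all
thresholds, with `None` standing for a rejection; how regions are matched between binary
images is not covered, and the function name is hypothetical.

```python
from collections import Counter


def final_classification(labels, min_hits=3):
    """Return the majority class if the region was accepted min_hits times or more."""
    hits = [label for label in labels if label is not None]
    if len(hits) < min_hits:
        return None                       # treated as a false target
    # most_common keeps first-seen order among equal counts
    return Counter(hits).most_common(1)[0][0]


per_threshold = ["tank", None, "tank", "APC", None]
decision = final_classification(per_threshold)
```

Here the region is accepted three times, so it counts as a detection, and it is classified
as a tank because that class was chosen for the majority of thresholds.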

3.6.  The  Experimental  Results 

After  the  optimization  of  the  algorithm  its  performance  was  tested  with  the  complete  image 
data.  A  survey  of  the  results  of  the  different  tests  is  given  in  Table  2. 

In experiment 1 all features and the optimal bounds for the discrimination coefficients
were used. Here 78 of the 84 vehicles could be detected; only 3 tanks, 1 APC and 2 jeeps
were lost. On the other hand 3 false targets occurred. As expected, the number of correctly
classified targets was clearly smaller: only 25 tanks, 13 APCs and 12 jeeps were correctly
recognized.

In  experiment  2  the  bounds  for  the  discrimination-coefficients  were  modified  in  such  a  man¬ 
ner  that  the  requirements  for  targets  were  higher.  Then  only  36  tanks,  25  APCs  and  13 
jeeps  could  be  located.  But  no  false  target  was  obtained.  In  addition  the  probability  of 
classification  was  equivalent  to  that  of  experiment  1 . 


In  experiment  3  a  reduced  feature  set  and  the  optimal  bounds  for  the  discrimination- 
coefficients  were  applied.  The  number  of  detected  targets  was  the  same  as  in  experiment  1. 
But  only  17  tanks,  13  APCs  and  12  jeeps  could  be  correctly  identified.  In  addition  13 
false  targets  occurred. 

In experiment 4 a further reduced subset of the feature set of experiment 3 and the optimal
bounds for the discrimination coefficients were used. Again the number of detected targets
is equivalent to that of experiment 1. But the results of the recognition process were
still worse than those of experiment 3. Only 8 tanks, 14 APCs and 9 jeeps could be
assigned to the right class. Thus the number of classified targets dropped to 31. On the
other hand the number of false targets rose to 16.


experiment       0      1         2         3         4         5         6
                      D    C    D    C    D    C    D    C    D    C    D    C

Vehicles        84   78   50   74   49   78   41   77   31   78   49   70   46
Tank            40   37   25   36   23   38   17   38    8   38   21   34   22
APC             29   28   13   25   14   27   13   26   14   27   15   23   11
Jeep            15   13   12   13   12   13   11   13    9   13   13   13   13
false alarm           3         0        13        16        12         8

0   total number of targets in images 1 to 43 of the Alabama data base
1   features P1-P17; threshold d_A
2   features P1-P17; threshold d*
3   features P3, P?, P7, P?, P12, P13, P1?, P17; threshold d_A
4   features P?, P7, P?, P1?, P17; threshold d_A
5   features P3, P7, P?, P1?, P17; threshold d_A
6   features P3, P7, P?, P1?, P17; threshold d*

D   detection                 d_A = 0.5/1.1/0.2
C   classification            d*  = 0.6/1.1/0.3

Table 2: Results for detection and classification

In experiment 5 another subset of the feature set used in experiment 3 and the optimal
bounds were applied. This time, however, not only the probability of detection but also
the probability of classification was equivalent to that of experiment 1. The number of
false targets was again clearly higher than in experiment 1.

In experiment 6 the same features as in experiment 5 and the modified bounds for the
discrimination coefficients applied in experiment 2 were used. Here only 70 targets, namely
34 tanks, 23 APCs and 13 jeeps, were detected, while 8 false targets occurred. The result
of the recognition process differs from that of experiment 5 only in the number of correctly
classified APCs: now 11 instead of 15 could be correctly identified.

The  obtained  results  can  be  summarized  as  follows: 

-  approximately 90% of the targets of the Alabama data base can be detected
with only a few false alarms

-  50-60%  of  the  targets  can  be  recognized 

-  the  number  of  features  can  be  reduced  to  5  with  only  a  small  decrease  in  the 
detection  and  classification  rate 

-  undetected targets are usually located in such a difficult environment that
there is nearly no chance of detection by algorithms with a similar basic
philosophy (this effect could be demonstrated by the mutual evaluation of the
results within the cooperative research project).
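
The percentages in this summary follow directly from the totals of Table 2, as the short
calculation below shows. The tuples hold the detected vehicles, the correctly classified
vehicles and the false alarms of each experiment as read from the table; the variable and
function names are hypothetical.

```python
TOTAL_TARGETS = 84

# experiment: (detected, correctly classified, false alarms), read from Table 2
RESULTS = {
    1: (78, 50, 3),
    2: (74, 49, 0),
    3: (78, 41, 13),
    4: (77, 31, 16),
    5: (78, 49, 12),
    6: (70, 46, 8),
}


def rates(detected, classified, total=TOTAL_TARGETS):
    """Detection and classification rates in percent of all targets."""
    return 100.0 * detected / total, 100.0 * classified / total


summary = {exp: rates(d, c) for exp, (d, c, _) in RESULTS.items()}
```

For experiment 1 this gives a detection rate of about 93% and a classification rate of about
60%; experiment 2, with no false alarms, still reaches about 88% and 58%.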
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Figure  2:  Correlation-Tracker  (Principle) 
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Figure  7:  Principle  of  Background  Correlation 
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SUMMARY 


In complex air combat, multiple-target tracking and target identification are very
difficult tasks for a pilot. To support the pilot, procedures for automatic tracking and
identification that process multi-sensor data and sensor images should be developed.

The paper presents a procedure for non-cooperative identification of aircraft by pattern
recognition applied to TV/IR sensor images. Highly accurate multi-sensor tracking provides
a very well known aircraft position. An imaging sensor with a narrow field of view is
pointed at the aircraft when it is in range. The image is then processed for significant
parameters. The contours are extracted by suitable image processing methods and evaluated
for geometric relations which are independent of the projection angle and specific to
individual aircraft types. Application to typical aircraft contours shows that the ensemble
of extracted parameters can identify different aircraft types with some significance.

Furthermore, an algorithm based on moment invariants is proposed. The algorithm was
implemented and tested on a digital computer by means of simulated noisy images. Some
examples of identification results are presented and discussed.

1.  INTRODUCTION

Future airborne weapon systems will have a higher degree of automation and integration
for target detection and identification. During air combat, especially in dogfight
situations, the pilot's workload is extremely high. He has to perform many tasks
concurrently under severe physical conditions in a high-g manoeuvre environment. Under
these conditions it is necessary to support the pilot in target identification by
automatic and/or semi-automatic procedures.

An autonomous air combat is characterized by the fact that only a primary radar can
detect and localize a target within the range of an air-to-air missile. But the resolution
of a radar does not allow friendly and enemy aircraft types to be distinguished. A secondary
radar (IFF) can only recognize whether the target is cooperative (answering) or
non-cooperative; it does not ensure that a non-cooperative target is an enemy. Moreover,
the angular resolution of a secondary radar is very low, so it cannot separate targets
which are close together. In combat situations with friendly and enemy aircraft it is
therefore still often necessary to close in for a visual target identification /1/.
Electro-optical means are used in an attempt to achieve a larger identification range.

The  objective  of  this  paper  is  to  propose  some  methods  for  multisensor  target  detec¬ 
tion  and  target  tracking  and  performing  target  identification  by  pattern  recognition 
applied  to  electro-optical  sensor  images. 

2.  MULTI-SENSOR APPROACH


Information  about  an  airborne  target  can  be  collected  by  different  sensors,  for 
example 

-  radar 

-  radar  warning 

-  TV/IR-imaging  sensor 

As shown in figure 1, every sensor obtains information about the target which is corrupted
by noise. For evaluation, the signals of an individual sensor may be filtered. But instead
of displaying these filtered signals directly to the observer, who can only watch the
output of one information channel at a time, it is more convenient to evaluate the filtered
information of all sensors together in a sensor fusion, by automatic correlation/combination
of the information of all sensors with known information about the objects to be identified,
stored in a catalog. Because it evaluates the total information, the correlation process
has a higher probability of identifying the target. The results are then displayed to the
observer, who may have to perform additional interactive functions that draw on his greater
ability to interpret complex situations.

Applying this concept of sensor fusion, it is obvious that the characteristics of the
sensors complement each other. Radar and radar warning can detect a target at long range,
but they cannot positively identify a target on a non-cooperative basis because the target
usually appears in only one resolution cell (this might not be true for jet engine
modulation and for a more complex radar with very high resolution). When the target closes
in, the radar can give information to the imaging sensors to direct them onto the target.

If the target is within the range of radar and TV/IR sensors, the information of these
sensors can be used to track the target accurately. For multiple target tracking the
characteristics of the sensors complement each other. The radar has a high resolution
in range and range rate but a low update rate when operating in a scan mode.

The TV/IR sensors have a high resolution in azimuth and elevation and an extremely
high update rate. By combining this information in an appropriate tracking filter the
targets can be tracked accurately. From the target track the kinematics of the flight path
can be derived, and from these the characteristics of the target such as:

-  velocity 

-  flight altitude

-  g-maneuver 

-  climb  rate 

These data extracted from the flight path can be compared with stored data of target
types: if an aircraft type in the catalog cannot achieve the observed performance of
the target, the target cannot be of that type.
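
This exclusion by performance can be sketched as a filter over the catalog. The catalog
entries, their limits and all names below are invented for illustration and do not describe
real aircraft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performance:
    max_speed: float         # m/s
    max_altitude: float      # m
    max_load_factor: float   # g
    max_climb_rate: float    # m/s


def candidate_types(catalog, speed, altitude, load_factor, climb_rate):
    """Return the catalog types whose limits are not exceeded by the observed track."""
    return [name for name, perf in catalog.items()
            if speed <= perf.max_speed
            and altitude <= perf.max_altitude
            and load_factor <= perf.max_load_factor
            and climb_rate <= perf.max_climb_rate]


# Hypothetical catalog with made-up limits.
CATALOG = {
    "type A": Performance(600.0, 15000.0, 9.0, 250.0),
    "type B": Performance(250.0, 9000.0, 3.0, 40.0),
}
remaining = candidate_types(CATALOG, speed=400.0, altitude=8000.0,
                            load_factor=5.0, climb_rate=100.0)
```

The track can only rule types out: a slow, gentle flight path is compatible with every
entry, so the kinematic test narrows the candidate list but does not identify the target
on its own.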

If the target closes in further, the TV/IR sensors should be switched to a narrow
field of view and, using the very accurately known target position, directed onto the
target. If the target is represented by an appropriate number of pixels in the image,
it can be evaluated by pattern recognition, for which some approaches are discussed in
chapter 4. To support the digital image processing, the accurately known position
of the target can be used to reduce the amount of computing by processing only that
part of the image in which the target is located. Furthermore, the known flight path
of the target together with the known own flight angles can be used to roughly estimate
the aspect angle of the target, which is an advantage for the pattern recognition.

3.  SENSOR  CHARACTERISTICS 

Airborne  target  identification  can  be  performed  by  the  following  types  of  passive 
imaging  sensors  /II/: 

o  TV-systems  (Daylight-TV,  LLL-TV) 
o  Thermal  imaging  devices  (FLIR) 

The  quality  of  imaging  systems  can  be  described  by  means  of  the  following  character¬ 
istics/parameters  : 

-  SNR (signal-to-noise ratio)

-  geometrical resolution

-  minimal contrast at display (for TV),
   MRΔT (minimal resolvable temperature difference)
   (for imaging IR, FLIR)

-  dynamic range (detectors, display)

-  saturation effects

-  lag, image blurring

-  scan format (number of lines, frame rate, interlacing)

-  video output (serial, parallel)

-  degradation by atmospheric effects

-  night capability (bad weather capability)

PERCEIVABLE  IMAGE  CONTRAST 

In order to perceive a certain apparent image contrast $C_i$, this contrast (or the
modulation of the signal carrier n, e.g. signal current, intake intensity etc.) must
exceed the relative noise of the signal carrier, i.e. the reciprocal of the SNR, by the
factor $k_r$:

$$C_i \ge k_r \, \frac{\sigma(n)}{\bar{n}}$$

with $\bar{n}$ = mean value of signal carrier,

     $\sigma(n)$ = fluctuation of n (noise)

This relation is valid for each type of perception, even for film granulation and
image (background) structure /3/.

The image quality is not only degraded by noise; it is also affected by the transfer
characteristics of the imaging system (mostly low-pass characteristics). This influence
can be described by the MTF (modulation transfer function). With respect to a
characteristic parameter $\alpha$ (e.g. spatial frequency, angular resolution) the
transition from the object contrast $C_o$ to the apparent image contrast $C_i$ can be
described as

$$C_i(\alpha) = C_o \cdot T_m(\alpha)$$

with $T_m(\alpha)$ = MTF curve.

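For the recognition of an object (by means of a display) the following condition must be
fulfilled:

$$C_o \ge \frac{C_{i,\min}(\alpha)}{T_m(\alpha)} \ge \frac{k_r}{T_m(\alpha)} \left[ \frac{\sigma(n)}{\bar{n}(\alpha)} \right]_{out}$$

This equation is to be normalized by means of the noise figure F,

$$F = \frac{\left[ (\sigma(n)/\bar{n})^2 \right]_{out}}{\left[ (\sigma(n)/\bar{n})^2 \right]_{in}} = \frac{1}{\eta}$$

with $\eta$ = quantum yield, which gives

$$C_o \ge \frac{k_r \sqrt{F}}{T_m(\alpha)} \left[ \frac{\sigma(n)}{\bar{n}(\alpha)} \right]_{in}$$
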
In this equation the noise figure F is to be chosen according to the type of image tube
or detector. As the equation demonstrates, the noise figure is a significant factor in
characterizing the system performance. In addition, however, the system has to be
described as a function of resolution and object structure, most usefully by the MTF.

A recent experimental investigation of the recognizability of details in TV pictures
impaired by noise confirms that the SNR should exceed 10 dB to 20 dB, depending on the
details to be recognized /6/.

SOURCES  OF  NOISE 


The generation of noise is based on two effects:

1. thermal noise and 2. statistical fluctuations (quantum noise). Thermal noise is
generated by the load resistance of the detector and by the preamplifier.

The quantum noise is caused by two sources: one internal source (the detector dark
current) and one external source (the inherent quantum noise of the received
radiance).

In  every  imaging  system  the  detector  current  consists  of  three  parts: 


-  signal  current:  I  . 

This  current  is  generated  by  irradiance  power  carrying  the  desired  information 

-  background  current:  I  . 

This  current  represents  the  undesired  radiance  from  the  background 

-  dark  current:  1^. 

This current is generated internally by the detector.

MTF  (modulation  transfer  function) 

The total MTF of an operating system results from the product of the MTF of the atmosphere
and the MTF of the EO sensor (both have low-pass characteristics); the corresponding point
spread functions are convolved. The MTF of the sensor is determined by the MTF of the
optics, by the MTF of the detector (e.g. detector array or image tube), and by the MTF of
the video amplifier. The total MTF can be slightly adjusted by controlling the frequency
characteristic of the video amplifier.
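
For cascaded linear stages the component MTFs multiply at each spatial frequency, which the
following sketch illustrates. It assumes Gaussian-shaped component MTFs purely for
convenience; the characteristic frequencies and the function names are invented and are not
data of any sensor discussed here.

```python
from math import exp, prod


def gaussian_mtf(frequency, f0):
    """Hypothetical low-pass MTF: 1 at zero frequency, 1/e at frequency f0."""
    return exp(-(frequency / f0) ** 2)


def system_mtf(frequency, characteristic_frequencies):
    """Total MTF as the product of the component MTFs at one spatial frequency."""
    return prod(gaussian_mtf(frequency, f0) for f0 in characteristic_frequencies)


# Made-up characteristic frequencies (cycles/mrad): atmosphere, optics, detector, amplifier
STAGES = [40.0, 30.0, 15.0, 25.0]
curve = [(f, system_mtf(f, STAGES)) for f in (0.0, 5.0, 10.0, 15.0, 20.0)]
```

Because every factor is at most 1, the total MTF can never exceed that of the weakest stage,
which is why the detector usually dominates when the optics are diffraction limited.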

Usually the optics are diffraction limited and the detector characteristics have the
greatest influence. Figure 2 shows the modulation depth of a 2/3" and of a 1" vidicon
as a function of the line number.


RESOLUTION 

The maximum resolution of an EO system is to be calculated from the necessary SNR and the
MTF curve.

For  TV-image  tubes  the  SNR  degrades  with  decreasing  illuminance  on  the  face  plate  and  with 
decreasing  image  contrast,  and,  therefore,  the  resolution  degrades  too,  as  shown  in 
figure  3.  These  or  similar  curves  are  shown  in  the  data  sheets  of  the  manufacturers. 

(The  IR-Sensors  are  of  similar  behavior). 

The  highest  resolution,  that  can  be  obtained  at  a  certain  contrast  level  determines  the 
recognizability  of  objects  and  their  detail  structures.  This  relation  is  valid  for  visual 
observation  via  monitor  and  for  automatic  image  processing. 

High  stabilization  quality  is  necessary  to  avoid  performance  degradation  due  to  vibrations 
of  aircraft  and  shock. 


OPERATIONAL  CHARACTERISTICS 

For a realistic application the optical aperture is limited for technical and financial
reasons. For diffraction-limited optics the maximal angular resolution is given in figure 4.
The corresponding ranges for imaging sensors with these optics are displayed in figure 5
for several target sizes, but without atmospheric influences. Under these conditions a
high-resolution TV system (α ≈ 0.025 mrad) yields identification ranges from
approximately 4 km at head-on aspect up to more than 15 km for full target size.
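
The order of magnitude of such ranges can be checked with a small geometric sketch. It
assumes the Rayleigh criterion for the diffraction limit and a fixed number of resolution
elements across the target as the identification criterion. The target dimensions (4 m
head-on, 15 m full size), the 40 elements and the function names are hypothetical values
chosen for illustration; they are not taken from figures 4 and 5.

```python
def rayleigh_resolution(wavelength, aperture):
    """Diffraction-limited angular resolution in rad (Rayleigh criterion)."""
    return 1.22 * wavelength / aperture


def elements_on_target(size, distance, alpha):
    """Number of resolution elements of angular size alpha across the target."""
    return size / (distance * alpha)


def range_for_elements(size, elements, alpha):
    """Distance at which the target still spans the required number of elements."""
    return size / (elements * alpha)


ALPHA = 0.025e-3                                        # rad, resolution quoted in the text
head_on_range = range_for_elements(4.0, 40, ALPHA)      # hypothetical 4 m frontal extent
full_size_range = range_for_elements(15.0, 40, ALPHA)   # hypothetical 15 m overall extent
aperture_needed = 1.22 * 0.55e-6 / ALPHA                # m, for visible light at 0.55 um
```

Under these assumptions the ranges come out at 4 km and 15 km, and an aperture of roughly
27 mm would already be diffraction limited at 0.025 mrad in the visible band.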

But under various conditions the sensor range can be severely limited by the extinction
of the atmosphere. The atmospheric extinction is very high for horizontal paths
near the ground, but decreases rapidly with height.

Therefore the atmospheric extinction in air combat situations is, apart from clouds,
mostly extremely low, and the ranges are then determined only by the sensor performance.

For a model clear standard atmosphere, contours of constant atmospheric transmittance have
been calculated /5/ at different radiation wavelengths (figure 6). The sensor range is
severely limited for horizontal paths near the ground, but increases rapidly for slant
ranges with increasing height of the observer.

The sensor range at high atmospheric transmittance is limited only by the geometrical
resolution and the SNR. (For special purposes the transmittance can be calculated by
using the LOWTRAN program.)

Figure 7 /4/ displays the probability of target detection as a function of the target
distance (horizontal path at sea level) for a TV camera (625 lines, 3/4" vidicon,
resolution ≈ 0.06 mrad). The dashed line represents the limit given by the SNR; the dotted
line displays the ranges for 5% contrast at the entrance of the optics caused by
atmospheric extinction. For a target with high contrast (K = 1; m = 100%) the resulting
range (curve 1) is affected both by the atmospheric extinction and by the SNR of the sensor.

For a target with low contrast (K = 0.3; m = 22%) the possible range is determined only
by the atmospheric transmittance.

For the application of IR sensors the principles are, despite the better atmospheric
transmittance, very similar, as figure 8 displays. Curve 1 shows the limitation by the
SNR of the detector, curve 2 is given by the geometrical resolution; the resulting
curve 3 is obtained by the convolution of 1 and 2.

4.  SOME  CONCEPTS  OF  PATTERN  RECOGNITION  FOR  TARGET  IDENTIFICATION 

In the discussion of the multi-sensor approach for target detection and localization,
the importance of electro-optical imaging sensors for identification was pointed out.

The previous chapter dealt with the characteristics of such sensors. In the concept to
be discussed it is supposed that a single static image is available in which the target
aircraft is represented by a sufficient number of pixels with sufficient contrast
to the background. That image is digitized and used for target identification by
digital image processing.

Three basic concepts of pattern recognition for identifying aircraft will be discussed
in some detail:

-  first concept using invariant length ratios

-  second concept using contour correlation

-  third concept using area moments.

For  all  three  concepts  that  part  of  the  image  containing  the  target  is  evaluated. 

Methods of image restoration may be applied if the image is corrupted by noise. If
the results of evaluating the image are presented to the observer in an interactive
mode, methods of image enhancement may be used. The next step in processing the image
is to digitally extract edges and special characteristics from the image, which are
used for interpreting geometrical structures and for finding the contours of the target
image. These features can be detected by their contrast to the background or to other
structures. In IR images, contrast is caused by temperature differences.
Therefore the leading edges of the wings and the tail, the fuselage nose and the engines
can be detected.

The first concept of pattern recognition uses aspect-angle-invariant length ratios.
Aspect-angle invariance means that no absolute lengths or angles should be evaluated.

The basic idea is to evaluate only length ratios which reproduce themselves for any
projection angle, as shown in figure 9. Under parallel projection, ratios of lengths
measured along the same line (or along parallel lines) are preserved, which is why each
ratio is referred to a reference line. Using the center fuselage line (which is defined
in the image by the line from the fuselage nose to the center of the engine
outlets) as reference line, one can define such length ratios, for example:

-  ratio of the length of the line from the intersection point of the leading edges of
the wings (or tail wings) to the engine outlets, compared to the length of the reference
line

-  ratio of the length of the line from the air inlets to the engine outlets, compared
to the length of the reference line.

Taking  the  wing  span  as  a  second  reference  line  one  can  define  additional  length 
ratios  for  example: 

-  ratio  of  tail  wing  span  compared  to  the  wing  span. 

These length ratios are evaluated in advance for the aircraft types to be identified
and stored in a catalogue. In the image of the target, the detected edges are interpreted
using geometrical knowledge of aircraft structure, for
example that the intersection of the leading edges of the wings lies near the
middle of the fuselage, or that the ends of the leading edges of the wings have, for
reasons of symmetry (and in any projection), the same distance from the fuselage
reference line. One has to recognize that this is not an easy interpretation when, for
example, both "normal" aircraft and aircraft of the delta-canard type have to be considered.


After identifying the meaning of the edges, the length ratios mentioned above can
easily be derived. By comparing these length ratios with the data stored in the
catalogue, the type of the aircraft can be identified with some significance.

In figure 10 an aircraft of the type F-18 is shown. From this picture the length
ratios $a_i$ are derived ($a_i = A'_i/L'$ for $i = 1, 3, 6, 7$, and $a_i = A'_i/S'$ for the
ratios referred to the wing span; the prime denotes projection).

The same parameters were derived in advance from aircraft views of the MIG-25, F-15 and
F-18. Evaluating the criteria as shown in figure 11 ($a_{ik}$ denotes the parameter of
catalogue aircraft k), the aircraft of figure 10 is identified with significance as an F-18
/2/.
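The catalogue comparison of the first concept can be sketched as a nearest-neighbour search over the stored ratio sets. The following is only an illustration: the catalogue entries, the type names and the root-mean-square distance are assumptions made for the example, not the criteria of figure 11.

```python
from math import sqrt

# Hypothetical catalogue: aspect-invariant length ratios per aircraft type.
# The numbers are placeholders for illustration, not measured values.
CATALOGUE = {
    "type_A": [0.52, 0.31, 0.45],
    "type_B": [0.61, 0.28, 0.58],
    "type_C": [0.47, 0.40, 0.39],
}


def ratio_distance(measured, reference):
    """Root-mean-square difference between two sets of length ratios."""
    squared = sum((m - r) ** 2 for m, r in zip(measured, reference))
    return sqrt(squared / len(measured))


def identify_by_ratios(measured, catalogue=CATALOGUE):
    """Return (best type, its distance, margin to the runner-up)."""
    ranked = sorted(
        (ratio_distance(measured, ref), name) for name, ref in catalogue.items()
    )
    best_dist, best_name = ranked[0]
    margin = ranked[1][0] - best_dist  # larger margin: more significant result
    return best_name, best_dist, margin


best, dist, margin = identify_by_ratios([0.60, 0.29, 0.57])
```

The margin to the second-best candidate is one simple way to express the "significance" of the identification mentioned in the text.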

The second concept of pattern recognition for aircraft identification is the correlation
of the contour of the target aircraft with stored contours of catalogue aircraft.
The image must be processed to find the outer contour of the target aircraft.
In a high-contrast environment this may not be a difficult task, but one has
to take into account the effects of shadows in a TV image and temperature contrast
within the contour in an IR image. The contour is then extracted using only the gray
shades white and black. This contour has to be correlated with the contours of catalogue
aircraft. It seems convenient to store the catalogue aircraft as three-dimensional
structures. For each catalogue aircraft a number of cross-sections of the fuselage
are stored. The wings are described by the corners of polygons, where symmetrical
structures need be stored for one side only. For the correlation of the contours the
projection of this stored three-dimensional structure must be calculated by means
similar to "computer aided design". This calculation has to take into account the
distance of the target aircraft, to get the right size of the projected contour, and
the aspect angle, to get the right projection. If the aspect angle is not known very
exactly, one may calculate the contours for slightly modified aspect angles. These
calculated contours of the catalogue aircraft must then be correlated with the contour
of the target aircraft. This can be performed as an area correlation of two images
with one-bit gray shade representation. The maximum of the correlation values over the
contours of all catalogue aircraft identifies the target aircraft with some significance.
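The area correlation of one-bit images can be illustrated with a short sketch. It assumes that the projected catalogue silhouettes are already available as binary images of the same size as the target window and are registered to it; the normalisation by the geometric mean of the two areas is a choice made for the example, and the names are hypothetical.

```python
def area_correlation(a, b):
    """Normalised overlap of two equal-sized binary images (rows of 0/1)."""
    overlap = sum(pa & pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    area_a = sum(map(sum, a))
    area_b = sum(map(sum, b))
    if area_a == 0 or area_b == 0:
        return 0.0
    return overlap / (area_a * area_b) ** 0.5


def best_match(target, projections):
    """projections: name -> list of silhouettes, e.g. for slightly varied
    aspect angles. The best score per catalogue aircraft is kept."""
    scores = {
        name: max(area_correlation(target, s) for s in silhouettes)
        for name, silhouettes in projections.items()
    }
    return max(scores, key=scores.get), scores


target = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
projections = {
    "cross": [[[0, 1, 0], [1, 1, 1], [0, 1, 0]]],
    "bar": [[[0, 0, 0], [1, 1, 1], [0, 0, 0]]],
}
name, scores = best_match(target, projections)
```

Taking the maximum over several silhouettes per aircraft corresponds to trying slightly modified aspect angles when the measured angle is uncertain.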

The  third  concept  is  discussed  in  more  detail  in  the  following  chapter. 

Besides the approaches mentioned above, a variety of methods are described in the
literature, with more or less reference to their general practical usefulness for our
applications. For computer implementation and tests a workable concept had to be
established.

5.  PROPOSED  METHOD  SUPPORTED  BY  COMPUTER  SIMULATION  RESULTS 


5.1  Assumptions  and  requirements 

For practical studies supported by computer implementation and tests, a method had to
be selected which would offer the following properties:

-  little computational work and storage (real-time processing)

-  sensitivity to structural perturbations (global geometric properties), giving a high
degree of separability of aircraft types

-  robustness against local noise and local geometric perturbations (weapons, optical
effects, filtering errors, edge errors)

-  translational and scale invariance (sufficient to a certain degree)

These properties promised to be provided by the well-known method of moments (momentum
method) used in optical character recognition (OCR). Thus the algorithms described in
this chapter, initially implemented on a minicomputer, are based on this method for the
central identification process. The main results of some tests are presented in this
chapter; further work in this direction is proceeding.

The algorithms were developed under the assumption that range and
target orientation data are available. The identification process described here works
on the premises that the relative target position is roughly known (an X-Y window
in the image plane with a possible complete target anywhere in it) and that the relative
aspect angles are known with sufficient accuracy. This information could, for
example, be supplied by a radar-based target tracker.

The  following  aircraft  types  were  used  as  target  data  base: 

F-14  A,  F-16  A,  SU-15,  F-18,  Tornado,  SU-11,  MIG-23,  MIG-25  (cf.  figure  12,  13). 

The  target  size  was  varied  near  10  pixels  per  0.5*  target  dimension  and  background 
noise  was  generated  by  simulating  a  Gaussian  random  noise  with  uncorrelated  error 
probability  of  0.1  to  0.2  per  background  pixel. 

5.2  Description  of  used  pattern  recognition  methods 
PREMISES 

The flowchart shown in figure 14 assumes the digitized image of a complete target
within a "target window" of size b_z x h_z pixels, stored in the core memory of the
digital identification computer. Furthermore it is assumed that time-correlated
distance and target orientation data are stored.

IMAGE PREPROCESSING

Noise abatement within the target window is done by an appropriate two-dimensional
filter (window threshold function), and as a result a preprocessed image is
stored for further processing. Since small target patterns or target pattern
elements may suffer local contour perturbations, the identification process proper
must be sufficiently insensitive to them, without loss of sensitivity to the essential
distinctive geometric features.

IDENTIFICATION  PROCESS 

The identification process provides a mapping

$$ I : G(b_z, h_z; x, y) \longrightarrow R^n $$

from the set of gray value functions over the target window $b_z \times h_z$ into an
n-dimensional signature space $R^n$. n should be chosen such that a sufficient separation
of the target catalogue is guaranteed using only reliable and efficient criteria for
separation. In our work we have proposed to make use of the momentum method
(cf. /7/, /8/, /9/, /10/). Setting

B(x,y) := gray value function of a pixel (x,y)
within the target window $Z = b_z \times h_z$

the central moments $\mu_{i,j}$, $i, j \in N_0$, are calculated:

$$ \mu_{i,j} = \iint_Z (x-\bar{x})^i \, (y-\bar{y})^j \, B(x,y) \, d(x-\bar{x}) \, d(y-\bar{y}) $$

$$ \bar{x} = \frac{\iint_Z x \, B(x,y) \, dx \, dy}{\iint_Z B(x,y) \, dx \, dy}, \qquad \bar{y} = \frac{\iint_Z y \, B(x,y) \, dx \, dy}{\iint_Z B(x,y) \, dx \, dy} $$

For separation, the specific target area $F(b_z, h_z) = \mu_{0,0}/r$ with respect to the
precalculated target distance r and two algebraic invariants $G_1(b_z, h_z)$, $G_2(b_z, h_z)$
based on the second-order moments $\mu_{i,j}$, $i + j = 2$, are computed. These three
criteria are translationally invariant and rotationally invariant with respect to the
image plane. $G_1$ and $G_2$ are (theoretically) scale invariant without using r. In
practice, discretisation errors may cause some trouble, especially for small target
patterns.
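A discrete version of the moment computation can be sketched as follows. The text does not spell out the two shape invariants, so the sketch assumes the two second-order invariants familiar from Hu's moment invariants, formed from central moments normalised by $\mu_{0,0}^2$; the range-dependent scaling of the area criterion is left out. Function names are hypothetical.

```python
def central_moments(image):
    """Return (mu00, mu20, mu02, mu11) for an image given as rows of gray values."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("empty target window")
    xbar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(image) for v in row) / m00
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - xbar, y - ybar
            mu20 += dx * dx * v
            mu02 += dy * dy * v
            mu11 += dx * dy * v
    return m00, mu20, mu02, mu11


def shape_invariants(image):
    """Two second-order invariants (assumed here to play the role of G1, G2)."""
    m00, mu20, mu02, mu11 = central_moments(image)
    # Normalising second-order moments by mu00**2 removes the scale dependence.
    eta20, eta02, eta11 = mu20 / m00 ** 2, mu02 / m00 ** 2, mu11 / m00 ** 2
    g1 = eta20 + eta02
    g2 = (eta20 - eta02) ** 2 + 4.0 * eta11 ** 2
    return g1, g2


bar = [[1, 1, 1, 1], [1, 1, 1, 1]]
rotated = [[1, 1], [1, 1], [1, 1], [1, 1]]          # same bar, turned by 90 degrees
shifted = [[0, 0, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1], [0, 0, 1, 1, 1, 1]]
```

For the small bar above, the rotated and shifted copies give the same pair of values, which illustrates the translational and rotational invariance claimed in the text.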


Owing to the translational invariance it is not necessary to find a "geometric
center of gravity" or any other reference point or line for exact location of the
target relative to the origin of the X-Y image plane.

The rotational invariance may be used to reduce the storage needed for the target
catalogue. The invariants F, $G_1$, $G_2$ of the target window $b_z \times h_z$ are compared with
the respective values of the aircraft stored in the target catalogue, selected
via the precalculated target orientation. Special aircraft properties such as
variable wing geometry affect the target catalogue in that, for a fixed
orientation and aircraft type, the characteristics of the aircraft are mapped onto
a trajectory in the signature space $R^3$, whereas fixed-wing aircraft are mapped
onto a single point.

By means of a simple distance criterion in that space the most probable aircraft
type is computed. This criterion provides additional information about
possible alternatives and the reliability of the identification result.
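A distance criterion of this kind might look like the sketch below. It is an assumption-laden illustration: the signatures are triples (F, G1, G2) with made-up values, each component is divided by a hypothetical scale so that the three criteria are comparable, and a variable-geometry aircraft is represented by several sampled points of its trajectory.

```python
from math import sqrt


def signature_distance(sig, ref, scale):
    """Scaled Euclidean distance between two signatures."""
    return sqrt(sum(((s - r) / c) ** 2 for s, r, c in zip(sig, ref, scale)))


def classify(sig, catalogue, scale):
    """catalogue: name -> list of signature points (one point for a fixed-wing
    type, several points sampled along the trajectory for variable geometry).
    Returns all candidates ordered by distance, nearest first."""
    ranked = sorted(
        (min(signature_distance(sig, p, scale) for p in points), name)
        for name, points in catalogue.items()
    )
    return [(name, d) for d, name in ranked]


# Placeholder values for illustration only.
catalogue = {
    "fixed": [(1.0, 0.20, 0.010)],
    "swing": [(1.2, 0.25, 0.020), (1.1, 0.22, 0.015)],
}
scale = (0.1, 0.01, 0.005)
ranking = classify((1.1, 0.22, 0.016), catalogue, scale)
```

Returning the full ranking rather than only the winner keeps the information about possible alternatives; the gap between the first two distances indicates how reliable the result is.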

5.3  Identification  examples 

The described algorithms were written in FORTRAN and, together with the target
catalogue (cf. figures 12 and 13), implemented on a minicomputer.

The first identification example (F-18 side view) is shown in figures 15a
and 15b. The identification result is shown in figure 16. The area and shape
invariants F, $G_1$ and $G_2$ provide the correct identification result even without
noise filtering. $G_1$ and $G_2$ offer the F-18 and SU-11 as possible candidates. The
area criterion, however, excludes the SU-11, so that the F-18 remains as the correct
result.

Figures 15c and 15d show, as a second example, the top view of an F-18 against a noisy
background and the digitally filtered target window.

Figure 17 again shows the identification result. The shape invariants $G_1$ and $G_2$
leave the F-16 A, F-18 and MIG-25 as possible alternatives. The area criterion F
again correctly identifies the F-18.

5.4  Outlook

The proposed basic algorithms for target identification are being subjected to
extensive efficiency tests under different environmental conditions. Special
effort is applied to general algorithms for separating target areas from background
areas under practical conditions (e.g. seriously perturbed, incoherent
IR image elements, and the effect of aspect angle errors due to measurement by other
sensors such as a tracking radar).

6.  CONCLUSIONS

In air combat situations the operator's task performance can be improved by
multi-sensor-based and computer-aided target identification. It has to be considered
that in aircraft applications the computer capacity is limited and that the
identification process has to be done in nearly real time. The proposed methods
meet the requirements for aircraft implementation, and the first results of our
investigations are very encouraging.

The investigations are to be extended to determine the dependence on the aspect
angle and on the distortion of the aircraft contour caused by weapon payload.

Fig. 1  Multi-sensor approach for target identification (sensor fusion)
Fig. 2  Depth of modulation as a function of the number of lines z
Fig. 4  Theoretical angular resolution for diffraction-limited optics (in mrad)
Fig. 14  Identification algorithm

b)  Area criterion for identification

Figure 16: Identification of an F-18 (side view) according to Figure 15a and b and target catalogue

b)  Area criterion for identification

Figure 17: Identification of an F-18 (top view) according to Figure 15c and d and target catalogue

SUMMARY


Over the past ten years Honeywell has been involved in the development of infrared
imagery target screeners under internal, AFAL, NV&EOL and DARPA support.

The sophistication of reconnaissance and strike systems is continually increasing due to
the high threat operational environment. Thus, advanced Forward Looking Infrared (FLIR)
sensors are integrated on high performance aircraft. The task loading and high
information rate of advanced sensors have made it impossible for a human to perform the
target search/detection/recognition task accurately, consistently, and in real time.

The judicious application of image enhancement, automatic control functions and target
screening will improve the sensor/operator interface and lead us closer to the
realization of an autonomous target screener.

In this paper we describe an autonomous target screener concept. The basic functions of
an autonomous target screener are segmentation, feature generation, classification
(detection/recognition), and symbol generation.

Image segmentation is the function by which the image is segmented into background and
objects of interest. The image information within these objects of interest is processed
to generate a set of features which characterize the targets of interest. The
classification function utilizes a statistical/syntactic classifier for detection (target vs.
clutter decision) and recognition (truck, tank, APC, etc.). A symbol indicating the
position and type of target is displayed on the monitor for cueing purposes.

1.  INTRODUCTION 

Since 1969, Honeywell has been involved in research and development of infrared imagery
target screeners with internal, Air Force Avionics Laboratory (AFAL), Night Vision and
Electro Optics Laboratory (NV&EOL), and Defense Advanced Research Projects Agency (DARPA)
sponsorship.

The high threat operational environment necessitates increased sophistication of
reconnaissance and tactical systems. Thus, advanced Forward Looking Infrared (FLIR) sensors
are integrated on high performance aircraft. The high information rate of advanced
sensors and the task loading have made it impossible for a human operator to perform the
target search/detection/recognition task accurately, consistently, and in real time.

The judicious application of image enhancement, automatic control functions, and target
screening is expected to improve the sensor/human interface and eventually result in
the realization of an autonomous target screener.


Different  approaches,  generic  concepts,  laboratory  tests,  and  conclusions  on  autonomous 
target  screeners  are  presented  in  this  paper. 

2.  AUTONOMOUS  TARGET  SCREENER 

The basic functional block diagram of the future Autonomous Target Screener System (ATSS)
is presented in Figure 1. It consists of a Forward Looking Infrared (FLIR) imaging
sensor, an image enhancement module, a target screener and a TV compatible display.

The FLIR, with its standard 525/875 lines per frame, 30 frames per second format, and
the TV compatible display will not be discussed here. Image enhancement and the target
screener are discussed next.

2.1.  Image  Enhancement 

Image enhancement for visual TV and FLIR has been and continues to be fertile ground for
research and development. FLIR image enhancement may consist of D.C. restoration,
responsivity equalization, resolution and super resolution, minimum resolvable
temperature (MRT), and contrast enhancement. The first four improve the image quality and
have been discussed elsewhere (Narendra, P.M., 1977). The last optimizes the interface
between the FLIR sensor and display and will be discussed in this paper, because it is an
integral part of an autonomous target screener and has been reduced to practice
(Narendra, P.M., 1978).

2.1.1  Contrast  Enhancement 

The scene dynamic ranges (1000:1) encountered by imaging sensors can be much higher than
the CRT display dynamic range (20:1). In addition, the scene intensity extrema
change continuously.

In order to match the imaging sensor output to the display input we need a global
automatic gain/bias control (AGC) as well as a local area contrast enhancement. The first
determines the compression/expansion needed for scene images to fit into the display
dynamic range, and the second enhances the local contrast.

The  adaptive  contrast  enhancement  algorithm  performs  the  following  functions: 

•  Vary  the  local  average  brightness  (bias)  so  that  overall  dynamic  range  of  scene 
is  compressed; 

•  Enhance  local  variations  above  the  contrast  sensitivity  threshold  of  the  human 
eye;  and 

•  Automatically  fit  the  intensity  extrema  in  the  enhanced  image  to  the  display 
limits 

A functional block diagram of this algorithm is shown in Figure 2. The image intensity
at each point is transformed based on local area statistics: the local mean $m_{ij}$ and
the local standard deviation $\sigma_{ij}$ are computed on a local area surrounding the point.

The local area mean is first subtracted from the image at every point. A variable gain
is applied to the difference to amplify the local variations. A portion of the local
mean $m_{ij}$ is then added back to restore the subjective quality of the image. The local
gain $G_{ij}$ is adaptive, being proportional to M, to satisfy psychovisual considerations
(Weber's law), and inversely proportional to $\sigma_{ij}$, so that areas with small local
variance receive larger gain.

The transformed intensity is then:

$$ T_{ij} = G_{ij} \, (I_{ij} - m_{ij}) + m_{ij} $$

where the local gain is

$$ G_{ij} = \alpha \, \frac{M}{\sigma_{ij}}, \qquad 0 < \alpha < 1 $$

and M is the global mean.

To prevent the gain from being inordinately large in areas with large mean and small
standard deviation, the local gain is actually controlled as shown in Figure 3.
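The local transformation can be tried out with a few lines of code. This is a sketch under stated assumptions: the local area is a square window of hypothetical half-width `half`, the gain control of Figure 3 is replaced by a simple clamp between `g_min` and `g_max` (both invented for the example), and the final fit to the display limits is omitted.

```python
from math import sqrt


def enhance(image, half=1, alpha=0.5, g_min=1.0, g_max=5.0):
    """Apply T = G * (I - m) + m with G = alpha * M / sigma, clamped."""
    rows, cols = len(image), len(image[0])
    global_mean = sum(map(sum, image)) / (rows * cols)
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Local area: square window clipped at the image border.
            window = [
                image[r][c]
                for r in range(max(0, i - half), min(rows, i + half + 1))
                for c in range(max(0, j - half), min(cols, j + half + 1))
            ]
            m = sum(window) / len(window)
            sigma = sqrt(sum((v - m) ** 2 for v in window) / len(window))
            if sigma == 0:
                gain = g_max  # flat area: the difference I - m is zero anyway
            else:
                gain = min(g_max, max(g_min, alpha * global_mean / sigma))
            out[i][j] = gain * (image[i][j] - m) + m
    return out


flat = enhance([[5, 5], [5, 5]])
spot = enhance([[10, 10, 10], [10, 12, 10], [10, 10, 10]])
```

A flat image passes through unchanged, while the small bright spot in the second example is pushed further above its local mean.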

2.2  Target  Screener 

The target screener processes the raw or enhanced FLIR imagery and displays a symbol on
the display indicating the type and location of detected targets. The screener serves as
a cuer to the operator, who then makes the final decision.

The need for an automated target screener arises mainly from the task loading and high
information rate of advanced sensors. The human operator is hindered in performing his
task by display and human limitations. We discussed in Section 2.1.1 how contrast
enhancement results in a hands-off display operation. Humans are limited by physiological
and psychological factors in a threat rich environment, resulting in low probability
of detection and recognition. Under ideal laboratory conditions, human performance has
been reported (Krebs, M.J., 1974) in terms of probability of detection, PD = 70%, and
recognition, PR = 50%, with corresponding average search times of 4.5 and 5.5 seconds.
In field tests (STANO, 1971), the acquisition time increases and the probabilities of
detection and recognition decrease.

An autonomous target screener consists of the following four functions: image
segmentation, feature extraction, classification and symbol generation, as shown in
Figure 4. Each of these functions will be discussed below.

2.2.1  Segmentation 

Segmentation is the function by which the image is segmented into background and objects
of interest. There are basically two approaches to segmentation: one is based on edge and
intensity thresholding and the second is region based. These approaches are discussed
below.


2.2.1.1  Edge and Intensity Thresholding

This approach processes the FLIR image through an edge filter, determines the intensity
and edge statistics, estimates local or global thresholds and segments the image. This
approach is called autothreshold and adapts the edge and intensity thresholds to changing
scene contrast and intensity levels (Panda, 1978).

Figure 5 shows the overall concept for autothreshold. Briefly, the function of each box
is as follows: Raw video is passed through a low-pass smoothing filter. This limits the
bandwidth of the noise. The smoothed data is an input to both the edge filter and the
bright filter. The edge filter generates the edge magnitude of the smoothed data, from
which an edge threshold is determined for the next scan line. The output "Edge" is a
binary signal obtained by comparing the analog edge signal with the edge threshold. The
bright filter determines the background. The background image is processed to estimate
the bright threshold and subtracted from the raw image. Comparing the latter with the
intensity threshold, the binary "Bright" signal is generated.

The smoothing filter is a weighted average filter based upon the following equation:

$$
\begin{aligned}
I(i,j) = [\, & A(i-1,j-1) + 2A(i-1,j) + A(i-1,j+1) \\
 & + 2A(i,j-1) + 4A(i,j) + 2A(i,j+1) \\
 & + A(i+1,j-1) + 2A(i+1,j) + A(i+1,j+1) \,]
\end{aligned}
\qquad (1)
$$

where I(i,j) is the smoothed output at location (i,j) and A(i,j) is the video intensity.
This filtered video is then the input to the edge filter and to the bright filter.
Figure 6 is a sample of five scan lines of FLIR video over a tank. Figure 7 is the
resultant smoothed data.
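The 3 x 3 weighting of equation (1) can be written out as a small routine. Two points are assumptions of this sketch rather than statements of the text: the weighted sum is divided by 16 (the sum of the weights) so that the output stays in the input range, and border pixels are simply copied.

```python
# Weights of the 3 x 3 smoothing mask, as in equation (1).
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))


def smooth(a):
    """Weighted-average smoothing of an image given as rows of numbers."""
    rows, cols = len(a), len(a[0])
    out = [list(map(float, row)) for row in a]  # borders keep their input value
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            acc = sum(
                KERNEL[di + 1][dj + 1] * a[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
            )
            out[i][j] = acc / 16.0  # normalisation assumed for this sketch
    return out


impulse = smooth([[0, 0, 0], [0, 16, 0], [0, 0, 0]])
```

A single bright pixel of value 16 is reduced to 4 at its own position, which shows the centre weight of 4/16 at work.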

The two-dimensional edge filter (Sobel) calculates:

$$
\begin{aligned}
H_{j-1} &= I(i-1,j-1) + 2I(i,j-1) + I(i+1,j-1) \\
H_{j+1} &= I(i-1,j+1) + 2I(i,j+1) + I(i+1,j+1) \\
V_{i-1} &= I(i-1,j-1) + 2I(i-1,j) + I(i-1,j+1) \\
V_{i+1} &= I(i+1,j-1) + 2I(i+1,j) + I(i+1,j+1)
\end{aligned}
\qquad (2)
$$

where H and V are the horizontal and vertical components. Then the edge value associated
with the (i,j) pixel is

$$ E(i,j) = |H_{j+1} - H_{j-1}| + |V_{i+1} - V_{i-1}| \qquad (3) $$

Figure 8 is the Sobel edge for the data shown in Figure 7. Superimposed on the edge data
is the adaptive edge threshold. The edge threshold is

$$ E_n = K \, \bar{E}_{n-1} \qquad (4) $$

where $E_n$ is the threshold, $\bar{E}_{n-1}$ is the previous scan line edge average and K is
an optimum constant determined statistically.
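The Sobel edge value and the line-by-line adaptive threshold can be combined in a short sketch. The value of K and the threshold used for the very first scan line are hypothetical choices for the example; the text only states that K is determined statistically.

```python
def sobel_edge(I, i, j):
    """Edge value E(i, j) from the horizontal and vertical Sobel components."""
    h_prev = I[i - 1][j - 1] + 2 * I[i][j - 1] + I[i + 1][j - 1]
    h_next = I[i - 1][j + 1] + 2 * I[i][j + 1] + I[i + 1][j + 1]
    v_prev = I[i - 1][j - 1] + 2 * I[i - 1][j] + I[i - 1][j + 1]
    v_next = I[i + 1][j - 1] + 2 * I[i + 1][j] + I[i + 1][j + 1]
    return abs(h_next - h_prev) + abs(v_next - v_prev)


def edge_signal(I, k=1.5, initial_threshold=0.0):
    """Binary edge image; each line is thresholded with K times the edge
    average of the previous scan line."""
    rows, cols = len(I), len(I[0])
    edges = [[0] * cols for _ in range(rows)]
    threshold = initial_threshold  # assumed start value for the first line
    for i in range(1, rows - 1):
        line = [sobel_edge(I, i, j) for j in range(1, cols - 1)]
        for j, e in enumerate(line, start=1):
            edges[i][j] = 1 if e > threshold else 0
        threshold = k * sum(line) / len(line)  # threshold for the next line
    return edges


step = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
edges = edge_signal(step)
```

For the vertical step in the example, both interior scan lines mark the two pixels adjacent to the step as edge pixels.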

As was mentioned previously, the smoothed video is fed into the bright filter. The
primary function of the bright filter is to estimate the background intensity. This
background estimate is continually updated as the image is scanned, by a recursive filter
and updating logic. We determine if there is large contrast between scan lines on a pixel
by pixel basis. The background estimate J is built up over several scan lines. The
background is updated from scan line to scan line and is defined as

$$ J(i,j) = \beta \, J(i-1,j) + (1-\beta) \, I(i,j) \qquad (5) $$

where $\beta$ is a constant and $0 < \beta \le 1$. When not updating, J(i,j) = J(i-1,j). Once
the background J(i,j) is determined, it is subtracted from I(i,j) to give a zero reference.
Hence, Z(i,j) = I(i,j) - J(i,j) is data with zero reference to be thresholded. Z(i,j) is
compared to the bright threshold EPSI defined as

EPSI  *  NuWr--8)  !J(i,j)  -  J(i-1|j)l  (6) 

where u is a constant. The output of the comparison is the Bright signal and includes
hot and cold objects. Figure 9 shows the background estimate for the smoothed video
(Figure 7). The resultant video to be thresholded is shown in Figure 10 with the
threshold superimposed. The thresholded edge and intensity signals are combined in a
logical way to extract an interval over which an object may exist. The presence of a
leading edge, followed by a bright and concluding with a trailing edge constitutes an
interval. If the interval width falls between predefined max/min limits, the scan line,
starting column, and interval width are stored.
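The recursive background estimate and the zero-referenced signal can be sketched as below. The recursion assumes the usual first-order form in which the previous estimate is weighted by the filter constant and the new sample by one minus that constant. The updating logic is an invention for the example: the estimate is held wherever a pixel differs from the previous background line by more than a hypothetical `jump` value. The bright threshold itself is not reproduced here.

```python
def background_estimate(I, beta=0.9, jump=50.0):
    """Line-recursive background J; held (not updated) at large contrast."""
    rows, cols = len(I), len(I[0])
    J = [list(map(float, I[0]))]  # first scan line initialises the estimate
    for i in range(1, rows):
        prev = J[-1]
        line = []
        for j in range(cols):
            if abs(I[i][j] - prev[j]) > jump:  # hypothetical updating logic
                line.append(prev[j])           # not updating: keep old value
            else:
                line.append(beta * prev[j] + (1.0 - beta) * I[i][j])
        J.append(line)
    return J


def zero_referenced(I, J):
    """Z(i, j) = I(i, j) - J(i, j), the data that is thresholded next."""
    return [[a - b for a, b in zip(ri, rj)] for ri, rj in zip(I, J)]


video = [[10, 10], [10, 100], [10, 10]]
Z = zero_referenced(video, background_estimate(video))
```

In the example the hot pixel of value 100 does not pull the background up, so it stands out clearly in the zero-referenced data.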

Accumulating congruent intervals, candidate objects are extracted. Figure 11a represents
the raw FLIR image, Figure 11b the binary edge, Figures 11c and 11d the hot and
cold binary intensities, Figure 11e the object intervals and Figure 11f the extracted
objects.
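One possible reading of the interval logic and of the accumulation step is sketched here. It assumes that an interval is a contiguous run of edge or bright pixels that starts and ends on an edge pixel and contains at least one bright pixel, and that intervals on consecutive scan lines belong to the same object when their column ranges overlap. The width limits and function names are hypothetical.

```python
def scan_line_intervals(edge, bright, min_w=2, max_w=20):
    """Return (start column, width) of candidate intervals on one scan line."""
    intervals = []
    j, n = 0, len(edge)
    while j < n:
        if not (edge[j] or bright[j]):
            j += 1
            continue
        start = j
        while j < n and (edge[j] or bright[j]):
            j += 1
        width = j - start
        has_bright = any(bright[k] for k in range(start, j))
        # Leading edge, bright interior, trailing edge, and a plausible width.
        if edge[start] and edge[j - 1] and has_bright and min_w <= width <= max_w:
            intervals.append((start, width))
    return intervals


def accumulate(intervals_per_line):
    """Group overlapping intervals on consecutive lines into candidate objects."""
    objects = []
    for line, intervals in enumerate(intervals_per_line):
        for start, width in intervals:
            for obj in objects:
                last_line, s, w = obj[-1]
                if last_line == line - 1 and start < s + w and s < start + width:
                    obj.append((line, start, width))
                    break
            else:
                objects.append([(line, start, width)])
    return objects


found = scan_line_intervals([0, 1, 0, 0, 1, 0], [0, 0, 1, 1, 0, 0])
objects = accumulate([[(1, 4)], [(2, 3)], []])
```

Each candidate object is returned as a list of (scan line, starting column, width) entries, which are the quantities the text says are stored.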


2.2.1.2  Prototype Similarity

The  image  segmentation  scheme  using  prototype  similarity  transformation  (Aggarwal,  R.K., 
1978)  can  be  divided  into  the  following  major  steps: 

•  Attributes 

•  Prototype  Generation 

•  Threshold  Selection 

•  Prototype  Inference 

•  Cell  Inference 

•  Similarity  Relation 

STEP 1:  Attributes. A cell represents a single pixel or a collection of pixels,
depending upon the required resolution in the segmented scene. Some of the commonly used
attributes are average intensity, edge texture, etc. Suppose $x_1, \ldots, x_N$ are the N
attributes characterizing each cell. These N attributes may be N independent
measurements on each cell or may be N functions of M (M>N) independent measurements.

STEP 2:  Prototype Generation. For each of these N attributes characterizing a cell, a
two-dimensional distribution function F(j,i) is calculated as follows: Suppose the
attribute value of a cell is i. Count the number of cells in some experimentally chosen
neighborhood (depending upon the resolution, size of the target, etc.) that have
attribute value j. Accumulate this sum for all the cells in the picture that have attribute
value i. This sum gives F(j,i). Do this for all values of i and j.
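The counting procedure for the two-dimensional distribution function can be sketched directly. The square neighbourhood of hypothetical radius `radius`, and the decision not to count a cell as its own neighbour, are assumptions made for this example.

```python
from collections import defaultdict


def neighbour_distribution(cells, radius=1):
    """F[(j, i)]: number of neighbours with attribute value j, accumulated
    over all cells whose own attribute value is i."""
    rows, cols = len(cells), len(cells[0])
    F = defaultdict(int)
    for r in range(rows):
        for c in range(cols):
            i = cells[r][c]
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if (rr, cc) != (r, c):  # the cell itself is not counted
                        F[(cells[rr][cc], i)] += 1
    return F


F = neighbour_distribution([[1, 1], [1, 2]])
```

For the 2 x 2 example, the three cells of value 1 each see two neighbours of value 1 and one of value 2, so the entries for (1, 1), (2, 1) and (1, 2) are 6, 3 and 3.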

Next, initial background and target prototypes are determined using a priori information
about the scene. This can be done by locating typical background and target cells or by
using some attribute information about the background/target. For example, a running
motor is the brightest part of tactical FLIR images.

Let the target cell attribute value be $A_T$ and the background cell attribute value be
$A_B$. Based on these two values $A_T$ and $A_B$, two intervals $[A_T T_A, A_T/T_A]$ and
$[A_B T_A, A_B/T_A]$ are calculated, where $T_A$ is an empirically chosen threshold on the
value of attribute A.

These two intervals are assumed to be disjoint. The case of overlapping intervals
implies either a bad choice of target/background cues or a low value of threshold $T_A$.
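A small numerical illustration of the interval construction follows. It assumes positive attribute values and a threshold strictly between 0 and 1; the attribute values used are made up.

```python
def prototype_interval(a, t):
    """Interval [a * t, a / t] around attribute value a for threshold 0 < t < 1."""
    if not 0.0 < t < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return (a * t, a / t)


def disjoint(p, q):
    """True if the two closed intervals do not overlap."""
    return p[1] < q[0] or q[1] < p[0]


# Hypothetical target and background attribute values.
target, background = 200.0, 50.0
ok = disjoint(prototype_interval(target, 0.8), prototype_interval(background, 0.8))
too_low = disjoint(prototype_interval(target, 0.4), prototype_interval(background, 0.4))
```

With a threshold of 0.8 the two intervals are roughly [160, 250] and [40, 62.5] and stay apart; with 0.4 they widen to [80, 500] and [20, 125] and overlap, which is the low-threshold case the text warns about.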

These two disjoint intervals define the first two prototypes $P_0$ and $P_1$. For generating
additional prototypes, consider the two-dimensional distribution function shown in Figure
12. All the cells that belong to prototypes $P_0$ or $P_1$ are zeroed (shown by hatched
areas). Suppose the modified distribution function is $F'(j,i)$. Then for each attribute
value i, we have an attribute profile of neighbors. By considering each value i in the
intervals $[A_T T_A,\ A_T/T_A]$ and $[A_B T_A,\ A_B/T_A]$, the cumulative attribute profiles
$F_{P_0}$ and $F_{P_1}$ are calculated as follows:

$$F_{P_0}(j) = \sum_{i \in [A_T T_A,\ A_T/T_A]} F'(j,i) \qquad (7)$$

$$F_{P_1}(j) = \sum_{i \in [A_B T_A,\ A_B/T_A]} F'(j,i) \qquad (8)$$

An example of these profiles is shown in Figure 13. A maximum is located in each of
these profiles. The maximum of these maxima gives the location of the next prototype
interval. This corresponds to maximizing the probability of finding a neighbor that has
an attribute value outside the attribute intervals of previous prototypes. Suppose the
attribute value is $A_2$. This gives rise to an interval $[A_2 T_A,\ A_2/T_A]$ for the prototype
$P_2$. At this stage, there are 3 prototypes $P_0$, $P_1$, and $P_2$. Now there are three
cumulative profiles for the three intervals. The whole sequence of operations is repeated
until no more prototypes can be generated.
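The iterative generation of prototype intervals can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: attribute values are taken to be integers, "zeroing" is interpreted as ignoring every neighbor value j that already lies inside an existing prototype interval, and the names `interval` and `generate_prototypes` are hypothetical.

```python
import math


def interval(a, t):
    """Closed interval [a*t, a/t] of integer attribute values around value a,
    for a threshold 0 < t <= 1."""
    return (math.ceil(a * t), math.floor(a / t))


def generate_prototypes(F, cues, t):
    """F[j][i] is the joint distribution; cues = [target value, background value]."""
    levels = len(F)
    intervals = [interval(a, t) for a in cues]  # P0 (target), P1 (background)
    while True:
        # Neighbor values already covered by a prototype are zeroed in F'.
        covered = [any(lo <= v <= hi for lo, hi in intervals)
                   for v in range(levels)]
        best_count, best_j = 0, None
        for lo, hi in intervals:
            for j in range(levels):
                if covered[j]:
                    continue
                # Cumulative profile: sum of F'(j, i) over i in this interval.
                profile_j = sum(F[j][i]
                                for i in range(max(lo, 0), min(hi, levels - 1) + 1))
                if profile_j > best_count:
                    best_count, best_j = profile_j, j
        if best_j is None:          # every remaining profile is empty
            return intervals
        intervals.append(interval(best_j, t))


# Toy distribution with 32 attribute levels and three non-zero entries.
F = [[0] * 32 for _ in range(32)]
F[10][4] = 5    # background-like cells (value 4) often neighbor value 10
F[10][24] = 3   # target-like cells (value 24) sometimes neighbor value 10
F[1][4] = 2
prototypes = generate_prototypes(F, cues=[24, 4], t=0.5)
```

Each pass adds an interval that contains the newly selected value, so the set of uncovered values shrinks and the loop terminates.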

STEP 3: Threshold Selection. A numerical value between 0 and 1 needs to be chosen for
each attribute for defining prototype intervals. Too small a threshold leads to larger
intervals and consequently fewer prototypes, whereas too large a threshold leads to
smaller intervals and a larger number of prototypes. In the extreme cases, on one
hand, we may have only two prototypes, which will give rise to too many edge elements; on
the other, we may have so many prototypes that each cell is similar to only one
prototype, giving rise to too many different objects in the scene.

For FLIR images, a typical number of prototypes for each attribute is between 10 and 15,
so the thresholds can be adjusted to give the right number of prototypes.

STEP 4: Prototype Inference. Let $P_0, \ldots, P_n$ be the set of prototypes. Each one of these
prototypes has an interval on the attribute axis associated with it. Each cell in the
picture is labeled by a string of the prototypes it is similar to. A cell can be similar to
more than one prototype as the prototype intervals can overlap. During the labeling
process, a co-occurrence matrix is constructed. Each element $A_{ij}$ in the co-occurrence
matrix, $i = 0, \ldots, n$; $j = 0, \ldots, n$, corresponds to the frequency with which the
prototypes $P_i$ and $P_j$ occur together in labels.

The fact that the prototype $P_0$ was generated by a target cue and $P_1$ was generated by a
background cue is used to infer meaning for other prototypes. The co-occurrence matrix
is used to guide the inference. Suppose $A_{0i}$ is maximum for $i = i_1$ and $A_{1j}$ is maximum
for $j = j_1$. Depending upon which one of $A_{0 i_1}$ and $A_{1 j_1}$ is greater, either prototype
$P_{i_1}$ or $P_{j_1}$ is considered for inferring its meaning. The following rules are used to
infer meaning for a prototype:

•  A  prototype  whose  interval  overlaps  a  target  interval  and  does  not  overlap  a 
background  interval  is  a  target  prototype. 

•  A  prototype  whose  interval  overlaps  a  background  interval  and  does  not  overlap  a 
target  interval  is  a  background  prototype. 

•  A prototype whose interval overlaps both target and background intervals is an
edge prototype.

•  A prototype whose interval does not overlap a target or background interval is
assigned the "meaning unknown".
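The four overlap rules above reduce to two interval tests. The following sketch shows one way to write them; the function names and the single-letter codes ("T", "B", "E", and "U" for "meaning unknown") are illustrative choices, not notation from the paper.

```python
def overlaps(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) share a value."""
    return a[0] <= b[1] and b[0] <= a[1]


def prototype_meaning(proto, target_intervals, background_intervals):
    """Apply the four inference rules to one prototype interval."""
    hits_target = any(overlaps(proto, t) for t in target_intervals)
    hits_background = any(overlaps(proto, b) for b in background_intervals)
    if hits_target and hits_background:
        return "E"   # edge prototype
    if hits_target:
        return "T"   # target prototype
    if hits_background:
        return "B"   # background prototype
    return "U"       # meaning unknown


targets = [(60, 90)]
backgrounds = [(10, 20)]
meanings = [prototype_meaning(p, targets, backgrounds)
            for p in [(85, 99), (5, 12), (18, 65), (30, 40)]]
```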

STEP 5: Cell Inference. Each prototype in a cell is replaced by its inferred meaning.
The following string grammar is used to reduce the string to a single character:

TT → T
EE → E
BB → B
TB → E*
TE → T
BE → B
E*a → E*, a ∈ {T, B, E, E*}

where

T  => target cell
B  => background cell
E* => strong edge cell
E  => weak edge cell
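The grammar can be applied as a pairwise reduction over the list of prototype meanings in a cell. The sketch below assumes the rules are symmetric (for example, that BT reduces like TB) and that prototypes with "meaning unknown" have been dropped beforehand, since the grammar does not mention them; both are assumptions, as are the function names.

```python
from functools import reduce


def combine(a, b):
    """Reduce two cell symbols to one according to the string grammar."""
    if a == b:
        return a                      # TT -> T, EE -> E, BB -> B, E*E* -> E*
    if "E*" in (a, b):
        return "E*"                   # E*a -> E*
    if {a, b} == {"T", "B"}:
        return "E*"                   # TB -> E*
    return a if b == "E" else b       # TE -> T, BE -> B


def reduce_label(meanings):
    """Collapse a non-empty list of symbols to a single symbol."""
    return reduce(combine, meanings)


examples = {
    "target": reduce_label(["T", "E", "T"]),
    "strong_edge": reduce_label(["T", "B", "E"]),
    "background": reduce_label(["B", "E"]),
    "weak_edge": reduce_label(["E", "E"]),
}
```

Under the symmetry assumption the reduction does not depend on the order of the symbols: E leaves T and B unchanged, and E* absorbs everything.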

STEP 6: Similarity Relation. Based on each attribute, using the procedure described
above, a meaning can be assigned to each cell of the picture. Thus each cell has a string
of cell meanings, the length of the string being equal to the number of attributes
used. This is called a similarity relation. A cell must be assigned the same
meaning by all the attributes before it is assigned that meaning. Otherwise, the cell is
classified as "meaning unknown". A more complex relationship can be devised depending
upon the type of imagery, type of attributes, etc.

The prototype similarity transformation was tried on FLIR images of tactical targets.
The technique was first tried on full frames (520 x 480 pels) for target/background
segmentation and then on the isolated targets for component extraction. The target
center and its approximate size were recorded during digitization. The 8-bit digitized data
was scaled down to 100 grey levels to cut the computer memory requirements for storing
the joint distribution function.

A  cell  was  defined  as  2  x  2  pels  for  component  extraction  and  as  4  x  4  pels  for 
target/background  segmentation.  A  neighborhood  of  3  x  3  cells  was  used  for  calculating 
the joint distribution function. A threshold of 0.85 was used for defining the
attribute intervals.

The results using the average intensity over the cell as the only attribute are shown in
Figure 14. The top picture shows the original, the middle one the target/background
segmentation on full frames, and the bottom one the extracted components of a tank
target.

2.2.2.  Feature  Extraction 

The  image  over  an  extracted  candidate  object  is  processed  to  extract  the  following 
features: 

•  Total  length,  L 

•  Total  area,  A 

•  Length-to-width ratio

•  Area-to-perimeter-squared ratio

•  Average contrast

•  Moments: $a_{00}$ through $a_{30}$, $a_{03}$

where

$$a_{pq} = \iint_{\text{object}} x^p \, y^q \, I(x,y) \, dx \, dy$$

and I(x,y) is the intensity at (x,y).

The features are used for training and testing the classifier.
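On a digitized image the double integral becomes a sum over the pixels of the extracted object. The sketch below assumes the object is given as a binary mask over the image and that pixel column and row indices serve as x and y; the function name `moment` is illustrative.

```python
def moment(image, mask, p, q):
    """Discrete intensity moment a_pq = sum of x**p * y**q * I(x, y)
    over the pixels where mask is non-zero."""
    total = 0.0
    for y, row in enumerate(image):
        for x, intensity in enumerate(row):
            if mask[y][x]:
                total += (x ** p) * (y ** q) * intensity
    return total


image = [[0, 0, 0],
         [0, 2, 4],
         [0, 0, 0]]
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 0, 0]]

a00 = moment(image, mask, 0, 0)   # total intensity of the object
a10 = moment(image, mask, 1, 0)
a01 = moment(image, mask, 0, 1)
centroid = (a10 / a00, a01 / a00)  # intensity-weighted center of the object
```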

2.2.3  Classification 

Target classification for a small image, i.e., when the number of pixels occupied by the
target is small, can best be done by statistical methods. However, at short ranges or
high magnification, a "syntactic" classifier is used to exploit the available structural
information.


2.2.3.1 Statistical Classification

Statistical classifiers are rules for partitioning the feature space such that target
classes are separated from each other and from non-target objects by the partitioning
surfaces. Design of a statistical classifier requires a knowledge of the target and
non-target feature statistical distributions. These are usually provided in the form of
feature samples, extracted from representative imagery. Determining the optimum
partitioning from the sample set is called "training" the classifier. A separate,
statistically independent feature data set is used for "testing", or estimating the
performance of the classifier.

The three types of classifiers that we use are linear, discriminant tree, and k-nearest
neighbor classifiers. These classifiers can be used for either clutter rejection or
discriminating between classes of targets.

There are several approaches to the design of a linear discriminant classifier. The most
common utilize data from two classes at a time. If the data is bimodal and linearly
separable, a simple and effective approach is to use a Fisher linear classifier. If we
define:

$$W = (C_1 + C_2)^{-1} (M_1 - M_2) \qquad (9)$$

$$T = \frac{\sigma_1^2 M_2 + \sigma_2^2 M_1}{\sigma_1^2 + \sigma_2^2} \qquad (10)$$

where $C_1$, $C_2$, $M_1$, $M_2$ are the covariance matrices and means of the two sets of data and
$\sigma_i^2 = W^T C_i W$, the Fisher linear classifier is given by:

$W^T X > T$ implies X belongs to class 1

$W^T X < T$ implies X belongs to class 2

where X is the feature vector of the object being classified, and T is the optimum
threshold.
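A small numerical sketch of the Fisher linear classifier follows. It is restricted to two features so that the matrix inverse can be written out by hand, it uses unbiased sample covariances, and it assumes that the class means entering the threshold are first projected onto W so that T is a scalar. These choices, and all function names, are assumptions made for illustration.

```python
def mean(samples):
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(2)]


def covariance(samples, m):
    """Unbiased 2 x 2 sample covariance matrix."""
    n = len(samples)
    c = [[0.0, 0.0], [0.0, 0.0]]
    for s in samples:
        d = [s[0] - m[0], s[1] - m[1]]
        for a in range(2):
            for b in range(2):
                c[a][b] += d[a] * d[b] / (n - 1)
    return c


def fisher(class1, class2):
    """Return the weight vector W and the threshold T."""
    m1, m2 = mean(class1), mean(class2)
    c1, c2 = covariance(class1, m1), covariance(class2, m2)
    s = [[c1[a][b] + c2[a][b] for b in range(2)] for a in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]

    def project(v):
        return w[0] * v[0] + w[1] * v[1]

    def variance(c):                      # sigma^2 = W' C W
        return sum(w[a] * c[a][b] * w[b] for a in range(2) for b in range(2))

    v1, v2 = variance(c1), variance(c2)
    t = (v1 * project(m2) + v2 * project(m1)) / (v1 + v2)
    return w, t


def classify(x, w, t):
    return 1 if w[0] * x[0] + w[1] * x[1] > t else 2


class1 = [(4, 5), (5, 4), (6, 6), (5, 6)]
class2 = [(1, 1), (2, 1), (1, 2), (2, 3)]
w, t = fisher(class1, class2)
labels = [classify(x, w, t) for x in class1 + class2]
```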

A tree classifier is simply a logic tree in which each branch consists of a linear
discriminant. In its simplest form, it consists of a sequence of cascaded thresholds on the
features. These two approaches work well when only two classes are to be discriminated
and when the feature vectors are linearly separable. For a more general case involving
more than two classes and feature vectors that are not separable, the k-nearest neighbor
(kNN) classifier is more suitable. In the kNN approach, the set of training samples and
their class types are stored. A new sample vector X is classified as follows. Its
distances to all the stored training samples are computed, and the k nearest are found. The
new sample is assigned to the class belonging to the majority of its k nearest neighbors.
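The kNN rule described above can be sketched in a few lines. Euclidean distance is assumed here, since the text does not name a metric, and the feature values and class names are made up for the example.

```python
import math
from collections import Counter


def knn_classify(x, training, k):
    """training is a list of (feature_vector, class_label) pairs.
    Returns the majority label among the k stored samples nearest to x."""
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = [
    ((1.0, 1.0), "truck"), ((1.2, 0.8), "truck"), ((0.9, 1.3), "truck"),
    ((4.0, 4.2), "tank"), ((4.1, 3.9), "tank"), ((3.8, 4.0), "tank"),
]
near_trucks = knn_classify((1.1, 1.0), training, k=3)
near_tanks = knn_classify((3.9, 4.1), training, k=3)
```

With two classes, an odd k avoids tied votes.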

Using  the  kNN  classifier  on  FLIR  images,  we  have  been  able  to  achieve  nearly  80%  correct 
classification  between  two  types  of  military  targets.  Better  performance  can  be  expected 
in  some  scenarios  by  further  processing  of  frame-to-frame  sequences  of  decisions.  These 
smoothed  decisions  reduce  the  effects  of  random  errors  in  the  individual  decisions. 

Target  screener  performance  can  also  be  improved  by  using  range  data,  measured  either 
directly  or  indirectly.  Range  makes  possible  absolute  size  discrimination  which  is 
lacking  in  FLIR  imagery. 

2.2.3.2 Syntactic Classifier

There are cases where the targets show visually distinguishable shape and internal
intensity variation. Statistical classification of this type of target would require
an unmanageably large number of features and complicated statistical distributions. In
picture  recognition  problems,  when  the  number  of  features  required  is  very  large,  the 
concept  of  describing  complex  patterns  in  terms  of  a  (hierarchical)  composition  of 
simpler  subpatterns  becomes  very  attractive  (Aggarwal,  R.K.,  1978).  The  description  of  a 
target  in  terms  of  simpler  subpatterns,  i.e.,  target  components,  enables  us  to  perform 
syntactic  recognition  of  targets. 

The  assumptions  in  this  approach  to  tactical  target  recognition  are: 

•  Images  of  tactical  targets  are  "large"  enough  to  show  structures. 

•  It  is  easier  to  recognize  target  components  than  the  target. 

The first assumption deals with the sensor-target range. If the range is too long to
show any details inside the target image, one would have to resort to statistical
recognition techniques. But as the sensor-target range decreases and the target structure
becomes discernible, syntactic recognition schemes become feasible.

The second assumption deals with the relative ease of recognizing the target and its
components. If it is easier to recognize a target than its components, as would be the
case when the target image is only a few pixels, one would not employ syntactic
recognition schemes. But in low-quality images where recognition based on target outline is
not very reliable, a syntactic scheme can be successfully used to recognize targets,
provided the assumption on target image size holds. Syntactic recognition schemes can
also be successfully used for partially occluded targets where conceivable statistical
recognition schemes would fail.

Consider, for example, the image of a tank depicted in Figure 15. Suppose it is possible
to recognize the components of this tank, such as motor, hot spots, vents, barrel, etc.,
using statistical properties of each component and their spatial relationship. The
hierarchical (tree-like) structural information in this tank can be represented by a tree as
shown in Figure 16. Syntax rules can be used to describe tree structures like the one in
Figure 16. The syntax or grammatical rules for this example are:

TANK → RECTANGLE, HOTSPOTS, BARREL

RECTANGLE → TREAD, MOTOR, VENTS

Since different components of a target may be seen from different aspect angles, a
general set of rules can be inferred by training the classifier with tree structures of the
target viewed from different aspect angles. The general block diagram of the syntactic
approach to tactical target recognition is shown in Figure 17.

The syntactic tank-recognition scheme, shown in Figure 18, is discussed below. The feature
set of each segmented component is stored in a "label table" comprising the description
of the component. Classification of these segments as individual candidate components is
performed before syntactic recognition can be completed. This component classification
is done by applying statistical techniques to the feature vectors in the label table.

The goal of the syntactic tank-recognition algorithm is to use the label table for a possible
tank  and  decide  whether  the  object  can  be  classified  as  a  tank,  based  on  identification 
of  the  components.  First  the  motor  is  identified  as  the  initial  classification  step.  If 
no  additional  tank  components  can  be  further  identified,  boundary  shape  analysis  is 
required  for  recognition.  An  approximate  orientation  of  the  object  is  determined  using 
the  already  defined  body  and  motor  groups.  Direction  of  the  possible  tank  is  determined 
by  examining  the  location  of  the  motor  group  relative  to  the  body  group.  Search  regions 
are  established  for  locating  hot  spots,  vents,  and  barrel  of  the  tank.  Size,  shape,  and 
direction  features  are  used  for  component  recognition.  If  at  least  one  additional  tank 
component  is  found,  the  object  is  declared  a  tank.  Otherwise,  statistical  methods  based 
on  object  boundary  features  are  needed  for  further  classification. 
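The final decision logic of this procedure can be sketched as below. The label-table layout (a list of dictionaries with a "component" key), the component names, and the return strings are hypothetical; the sketch covers only the decision step, not the search-region or feature computations, and it assumes that an object without an identified motor is also passed to boundary analysis.

```python
ADDITIONAL_COMPONENTS = {"hot spots", "vents", "barrel"}


def syntactic_decision(label_table):
    """Declare a tank when the motor and at least one additional tank
    component have been identified; otherwise defer to boundary analysis."""
    found = {entry["component"] for entry in label_table}
    if "motor" in found and found & ADDITIONAL_COMPONENTS:
        return "tank"
    return "needs boundary analysis"


with_barrel = [{"component": "motor"}, {"component": "barrel"}]
motor_only = [{"component": "motor"}, {"component": "body"}]
decisions = [syntactic_decision(with_barrel), syntactic_decision(motor_only)]
```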

Honeywell has extensively tested the syntactic target recognition technique on FLIR tactical
targets. Figure 19 shows an example of syntactic tank recognition in a FLIR image.

2.2.4  Symbol  Generation 

Once a target is detected and classified, a symbol is generated and displayed at its (x,y)
position. There is a symbol for each class of targets, and its size is proportional to
the  size  of  the  target  image.  The  symbol  may  be  updated  every  time  a  decision  is  made. 

3.  CONCLUSION 

This  paper  presented  a  concept  for  an  autonomous  target  screener.  Based  on  the  material 
presented,  the  following  conclusions  are  made: 

•  The feasibility of detecting and recognizing tactical targets in thermal IR
imagery has been proven over a limited data base.

•  An  autonomous  target  screener  will  require  both  statistical  and  syntactic 
classifiers. 

•  An integrated FLIR, Image Enhancement, Target Screener system is possible in the
near future.

Figure 1. Autonomous Target Screener.

Figure 2. Functional Block Diagram of the Adaptive Contrast Enhancement Algorithm.

Figure 3. Local Area Gain Curve to Prevent Excessive Gain Variations.

Figure 4. Target Screener.

Figure 5. Autothreshold Functional Block Diagram.

Figure 6. Five Scan Lines of Raw Video Over a Tank.

Figure 7. Smooth Video of Data in Figure 3.

Figure 8. Edge with Threshold of Data in

Figure 9. Background Estimate of Data in Figure 4.

Figure 10. "Video-Background" of Data in Figure 4.

11f. Extracted Objects

Figure 11 (cont.). Image Segmentation

Figure 12. Two-dimensional Distribution Function $F'(j,i)$.

Figure 13. Accumulative Attribute Profiles $F_{P_0}$ and $F_{P_1}$.

Figure 14. (a) FLIR Image of a Tank, (b) Tank Detection at Low Resolution, (c) Tank
Detection at High Resolution.

Figure 17. Syntactic Approach for Tactical Target Recognition.

Figure 18. Syntactic Tank Recognition.

EVALUATION OF A REALTIME PROTOTYPE AUTOMATIC TARGET CUER


Terry  L.  Jones 

U.S.  Army  Night  Vision  &  Electro-Optics  Laboratory 
Fort Belvoir, Virginia, U.S.A.

SUMMARY 


The procedures used and problems encountered in the preliminary evaluation of a
prototype automatic target cuer which operates on video imagery are presented. Performance
measures are defined and their use and limitations described. An overview is also
given of the operation, training, and field test scenario used to test this target cuer.
Results of cuer performance are presented as examples of cuer evaluation procedures.

1.  INTRODUCTION 

In  March  of  this  year  Honeywell  Systems  and  Research  Center  delivered  to  our  Laboratory 
the  prototype  automatic  target  screener  (PATS)  for  preliminary  flight  testing.  PATS 
incorporates  many  of  the  algorithms  discussed  in  the  previous  paper  (Aggarwal,  1980), 
and  some  details  will  be  provided  here.  However,  this  paper  primarily  describes  the 
evaluation  of  automatic  cuer  performance.  The  main  objective  is  to  provide  insight  into 
the  complexity  of  this  task.  By  stimulating  thinking  on  this  task  now,  it  is  hoped  that 
evaluation  procedures  will  be  established  which  will  allow  the  NATO  forces  to  ask  for 
and  get  the  image  processing  capability  needed  in  the  future.  This  paper  hopes  to 
achieve  this  goal  by  presenting  examples  from  the  preliminary  testing  of  PATS. 

2.  PROTOTYPE  AUTOMATIC  TARGET  SCREENER  (PATS) 

The PATS hardware is shown in figure 1. The main unit is 60 cm long, 20 cm high, and
25 cm wide. With the small control box, shock mounts, and cabling, its mass is 28 kg.
It requires about 200 W for operation. It can operate on either 525 or 875 line, 2:1
interlaced, 60 field per second video. All operations are performed on a single video
field. When processing of that field is completed, another field is processed. Because
of this, the processing time beyond the preprocessor stage is image dependent; PATS
occasionally processes consecutive odd or even fields, but typically processes only
every sixth field.

The  functions  performed  in  PATS  are  shown  in  figure  2.  The  preprocessing  section  is 
primarily  used  to  provide  maximum  information  transfer  from  the  sensor  to  the  processor 
by  controlling  global  gain  and  bias  settings  of  the  sensor.  However,  operation  with 
an AC coupled detector array imposes a need for DC restoration, so this is also
included. In addition, the preprocessing section performs local area gain and brightness
control (LAGBC) for improving the imagery displayed to the operator.

After preprocessing, the image is segmented into objects of interest. The first step
in this process is the formation of intervals of interest on individual scan lines.
These intervals are formed by the matching of adaptively thresholded bright and edge
signals for up to about twenty intervals per line. The resulting intervals are
correlated line to line in bins which are filtered to form the objects of interest.

Next,  simple  features  are  calculated  for  the  objects  of  interest.  This  process  actually 
begins  during  the  field  scan,  and  these  features  (such  as  interval  position  and  bright 
count)  are  stored  in  the  data  memory.  The  features  are  used  to  discriminate  between 
clutter objects and suspected targets by a series of tests comprising the clutter
rejection classifier stage.

More  complex  features  (such  as  high  order  intensity  moments)  are  then  calculated  on  the 
remaining  objects.  At  this  stage  there  can  be  no  more  than  about  twenty  objects,  or 
processing  time  becomes  too  long  to  do  object  tracking  in  dynamic  imagery.  The  complex 
features  of  these  objects  are  then  compared  to  features  of  prototypes  of  up  to  five 
classes  that  have  been  stored  in  the  memory.  Each  object  is  classified  based  on  a 
majority  of  nearest  neighbor  prototypes.  The  object  classes  are  easily  changed  by 
replacing the features, which reside in erasable, programmable, read only memory
(EPROM),  with  new  prototype  features.  The  prototype  capacity  is  limited  to  about 
1000  for  realtime  operation. 

After the classification decisions are reached on all objects in the field being
processed, the results are stored, and a new field is processed through the same procedure,
generating another set of classified objects. The interframe analysis section does a
processed field to processed field correlation of the classified objects and keeps
track of the string of classification decisions made on each object. A multi-frame
classification decision for each object is then made based on the last five to fifteen
single field decisions. A symbol for each object is then generated based on the
multi-frame classification and added to the video at the location of the last segmentation of
each object classified.
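One simple way to turn a string of single-field decisions into a multi-frame decision is a majority vote over a sliding window. The text does not state which rule PATS applies to the last five to fifteen decisions, so the majority vote, the class name `ObjectTrack`, and the window length below are assumptions used only to illustrate the idea.

```python
from collections import Counter, deque


class ObjectTrack:
    """Keeps the most recent single-field decisions for one tracked object."""

    def __init__(self, window=5):
        self.decisions = deque(maxlen=window)   # older decisions drop out

    def add(self, label):
        self.decisions.append(label)

    def multi_frame_label(self):
        """Majority vote over the decisions currently in the window."""
        return Counter(self.decisions).most_common(1)[0][0]


track = ObjectTrack(window=5)
for label in ["apc", "apc", "tank", "tank", "truck", "tank", "tank"]:
    track.add(label)
smoothed = track.multi_frame_label()
```

Only the last five decisions remain in the window here, so the two early "apc" decisions no longer influence the result.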

3.  PATS  TRAINING 

The PATS hardware can operate on standard video from any sensor. However, PATS must
first be trained on the sensor's output. The sensor chosen to provide PATS' first
training imagery was the common module thermal imaging system called LOHTADS (Light
Observation Helicopter Mounted Target Acquisition/Designation System). Since this
system can provide range information to the center of the field of view, PATS' clutter
rejection stage was trained to use this information in addition to the intensity
information. A limited computer simulation was done by Honeywell on digitized images to
select the initial features to be incorporated in the software. Once the hardware was
fabricated, it was used to do this process from analog videotapes. The edge and bright
thresholds for generating intervals are adjustable in the hardware and are set to
optimize the segmentation. The clutter rejection stage is programmable, with the
microcode residing in EPROM. Although the microcode can be changed in a matter of minutes,
large changes require a time-consuming checkout, and even a simple change, like adjusting
the value of a limit for an area test, requires several hours of analysis from different
videotape inputs to assess its overall impact on clutter rejection performance.

Training the classifier requires only a matter of hours once the prototypes have been
selected. But once again, choosing the prototypes which give a representative sample
of the existing data is a difficult and time-consuming task. The initial classes
chosen for PATS were M60 tank, M113 armored personnel carrier, M35 flatbed 2½ ton
truck, and M151 jeep. Videotapes from LOHTADS of these four classes in all aspects and
with engines running were used to train the classifier. Since PATS is a statistical
classifier, all training and testing was done on targets with no partial obscuration.
To date, all training tapes have been generated at Fort A.P. Hill, Virginia.

4.  FLIGHT  TEST  SCENARIO 

PATS was mounted in the rear compartment of a light observation helicopter (OH-6) in
March for preliminary field testing. Alongside PATS were two videotape recorders,
one to record the input to PATS, the other to record the PATS output. The test was
performed at Fort A.P. Hill, Virginia. Most of the fort is hilly (less than 100 m) and
forested with a mixture of deciduous and coniferous trees. The target vehicles were
always located in the drop zone, a cleared area about 1 km by 4 km. Most of the drop
zone is covered with small shrubs, but there are bare spots and several clumps of small
trees. There are portions of paved and gravel roads through the zone, but more
prevalent are the many dirt trails. For this test, most of the vehicles were stationary;
however, the vehicles were always exercised for at least 20 minutes prior to collecting
data and their engines left running during the test. This was done because all training
data was of hot vehicles. Videotapes were generated as the helicopter flew from
several kilometers away to within 0.5 km of the targets. The altitude of the helicopter
never exceeded 200 m above ground level. The following data were collected during the
test:


a.  875  line  videotapes  of  both  LOHTADS  and  PATS  output  with  range  and  time  on 
audio  channels. 

b.  Weather  data  (temperature,  humidity,  wind  velocity,  barometric  pressure  and 
cloud  cover). 

c.  Aircraft  heading  and  altitude. 

d.  Target  vehicle  information  (location,  orientation,  and  radiant  temperature). 

The  initial  test  plan  was  to  reproduce  as  closely  as  possible  the  imagery  used  in 
training  PATS.  It  became  clear  the  very  first  day  that  this  task  could  not  be  achieved 
to our satisfaction because major differences between the two image sets were being
caused  by  different  environmental  conditions. 

5. PERFORMANCE MEASURES

With  the  appearance  of  automatic  image  processing  equipment  for  military  applications, 
it  becomes  necessary  to  define  quantifiable  performance  measures  that  can  be  used  to 
assess  approaches  and  equipment.  It  is  important  to  decide  if  a  cuer  is  good  enough 
for  a  specific  task,  to  choose  between  one  processor  and  another,  to  compare  men  to 
machines,  and  to  measure  development  progress  on  specific  hardware  and  progress  of  the 
technology  in  general.  In  this  section  the  techniques  used  to  describe  cuer  performance 
are  presented.  The  actual  parameters  used  in  a  general  performance  measure  are  defined 
for  both  detection  and  recognition  tasks.  This  measure  is  then  used  with  examples  from 
PATS testing to describe absolute cuer performance on a given data set, to make
comparisons with human observers, and to measure cuer improvements. A proposed method of
comparing  cuers  is  discussed  in  the  last  portion  of  this  section. 

5.1  PARAMETERS  USED  TO  EVALUATE  PERFORMANCE 

Most  measures  of  performance  are  defined  for  a  specific  mission  and  are  only  useful 
when  trying  to  evaluate  an  automated  system  for  that  particular  application.  For 
example,  an  effectiveness,  E,  can  be  calculated  for  a  system  on  a  given  set  of  images 
of size X by Y using
where $P_{jxy}$ is the probability of an object appearing which belongs to class j at location
xy; P(i,j) is the conditional probability that if an object belonging to class j appears,
that object is called a member of class i; and n is the number of classes of interest.
(The i and j are summed to n+1 to include the class of no interest objects.) The
coefficients in the sum are positive for correct decisions (i = j) and negative for
incorrect decisions (i ≠ j), with the magnitudes assigned based on the relative importance
of each decision for the particular mission. (For objects of no interest, P(n+1,n+1)
does not need to be measured because $A_{n+1,n+1} = 0$.) The best overall performance is
achieved when E is maximized.

Other possible measures can be defined which are less mission dependent than E. For
example, the probability of not detecting a target in class j (assigning it to the
non-target class) is simply P(n+1,j). The probability of target recognition, PR, is the
number of targets correctly classified divided by the total number of targets present.
The probability of incorrect target classification, PI, is just the number of targets
incorrectly classified divided by the total number of targets present. These are more
general measures than E because they combine many of the terms used in E. Therefore,
they cannot be used to predict the performance of a system in a particular scenario
as well as E can. However, they are more useful than E at this stage of cuer
development because they can be more easily used to describe general performance and to make
comparisons.
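These general measures can be read directly off a table of classification counts. In the sketch below, `counts[j][i]` is the number of objects of true class j that were called class i, with the last index standing for the non-target class. Treating a target assigned to the non-target class as "not detected" rather than "incorrectly classified" is an assumption of this sketch, as are the function name and the example numbers.

```python
def general_measures(counts):
    """Return (PR, PI, P_miss) from a square table of decision counts.
    Rows are true classes, columns are assigned classes; the last row and
    column stand for the non-target class."""
    n = len(counts) - 1                          # number of target classes
    total_targets = sum(sum(counts[j]) for j in range(n))
    correct = sum(counts[j][j] for j in range(n))
    incorrect = sum(counts[j][i]
                    for j in range(n) for i in range(n) if i != j)
    missed = sum(counts[j][n] for j in range(n))
    return (correct / total_targets,
            incorrect / total_targets,
            missed / total_targets)


# Two target classes plus the non-target class (made-up counts).
counts = [[6, 1, 3],
          [2, 5, 3],
          [1, 0, 9]]
p_r, p_i, p_miss = general_measures(counts)
```

Under this convention the three fractions sum to one, since every target is either recognized, misclassified, or missed.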

Unfortunately, no one of these general measures is very useful by itself because rarely
will only one change when the cuer or its algorithm changes. For example, increasing
PR would generally be considered good, but even doubling the value of PR is probably
bad if doing so also means that PI is tripled. Reducing PR by 10% is probably very
worthwhile if PI can be cut in half. The ratio of PR to PI is a better general
performance measure than either alone; but this is also inadequate because it does not
include a measure of the difficulty of the task being performed on the image set. For
example, the task of distinguishing between tracked vehicles and wheeled vehicles
(a two class task) is an easier one than the task of deciding among M60 tank, M48 tank,
M551 light tank, M113 APC, M35 truck, and M151 jeep (a six class task) in the same image
set. Therefore any performance measure should be a function of n, the number of classes.
In addition, it is important to note that any image set has a variety of difficulty of
the recognition tasks within it for even a fixed n; classifying 100% of the targets is
more difficult than classifying 10%, while refusing to decide on the classification
of the other 90%. This difficulty is at least partially proportional to PR. There is
not a simple yet satisfying way to take these two task difficulties into account, but
one way is to multiply the PR/PI ratio by n²PR. One further item that must be included
in the measure is the time allowed to make the classifications. If performing the
operations twice as fast is considered about twice as good, the inverse of the average
processing time can be simply included as a multiplicative factor. Taking the above
considerations into account, the following recognition performance measure was chosen
for the initial evaluation of PATS:

$$ \mathrm{pM} = \frac{(n P_R)^2}{P_I \, t}, $$

where  t  is  the  time  interval  upon  which  classification  decisions  are  based.  (In  PATS 
testing  t  is  the  number  of  fields  used  in  the  interframe  analysis  times  the  average 
time  between  field  grabs.)  This  pM  has  a  range  from  zero  (bad)  to  infinity  (perfect). 
Table  1  shows  the  range  of  values  expected  from  cuers  over  the  next  few  years  along 
with  subjective  ratings. 


Table 1: Illustration of the range of pM

Subjective   Number of    Probability of    Prob. of Incorr.     Time Required     Performance
Rating       Classes, n   Recognition, PR   Classification, PI   for Decision, t   Measure, pM

bad          2            0.2               0.2                  2.0 s             0.1 s^-1
fair         3            0.3               0.1                  1.0               0.9
good         4            0.4               0.05                 0.5               6.4
excellent    5            0.5               0.025                0.25              42


If the above pM is to be used to describe detection, then PR must be replaced by the
probability of target detection, PD, the number of targets detected divided by the total
number of targets. But in this case there is no longer a need to include a measure of
the difficulty of the task the cuer is doing on a given data set because the task is
fixed: the cuer must always choose between target and non-target for each position on
every image. Therefore, the numerator need only consist of 4PD. (n^2 = 4 is retained to
make the subjective ratings in table 1 still appropriate when discussing detection.)

PI, however, must be replaced by a measure of both targets called non-targets (missed
targets) and non-targets called targets (false alarms). Since this measure is only
to be used over a particular data set, it is not necessary to define a false alarm
rate. PI can be replaced by the sum of the probability of missing a target, Pm = (1 - PD),
and the number of false alarms per target, F. The time is inserted in the same manner
as before, yielding a performance measure for detection:

$$ \mathrm{PM}_d = \frac{4 P_D}{(P_m + F) \, t}. $$

The  performance  measures  just  described  are  probably  not  the  best  to  use;  there  are 
several  problems  with  them.  The  measures  could  become  unrealistically  large  as  the 
denominators  go  to  zero.  However,  this  occurrence  probably  indicates  a  lack  of  an 
appropriate  number  of  measured  samples  as  much  as  a  fault  with  the  pM.  Another  problem 
is  that  when  n  is  larger  than  two,  the  classes  are  never  going  to  be  equally  easy  to 
separate.  Despite  these  and  other  reservations,  these  measures  are  the  best  we  have 
at  the  moment  so  they  are  being  used  to  evaluate  cuer  performance. 
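
The two measures of this section can be written out as a short calculation. The Python sketch below is only an illustration and is not part of the PATS evaluation software: the function names are invented here, and returning infinity when a denominator is zero is an assumption (the text notes that this case usually signals too few measured samples). The sample inputs are the observer entries for 14 Mar 80 in tables 3 and 4.

```python
import math


def recognition_pm(n, pr, pi, t):
    """Recognition measure pM = (n * PR)**2 / (PI * t), in units of 1/s."""
    denom = pi * t
    if denom == 0:
        # Assumption: a zero denominator is reported as an unbounded score.
        return math.inf
    return (n * pr) ** 2 / denom


def detection_pm(pd, f, t):
    """Detection measure PMd = 4 * PD / ((Pm + F) * t), with Pm = 1 - PD."""
    denom = ((1.0 - pd) + f) * t
    if denom == 0:
        # Assumption: same convention as for the recognition measure.
        return math.inf
    return 4.0 * pd / denom


if __name__ == "__main__":
    # Observer values for 14 Mar 80: n=4, t=5 s, PR=.21, PI=.42, PD=.89, F=.05
    print(recognition_pm(4, 0.21, 0.42, 5.0))
    print(detection_pm(0.89, 0.05, 5.0))
```

Keeping the two measures as separate functions mirrors the text: detection fixes the class term at n^2 = 4 and replaces PI by Pm + F.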

5.2  ABSOLUTE  PERFORMANCE  MEASURES 

The  general  pM's  described  above  can  be  used  to  rate  the  performance  of  a  cuer  on  any 
given  set  of  data.  Table  2  shows  the  results  of  such  measurements  for  five  data  sets 
collected  for  similar  scenarios  with  four  targets  located  in  the  center  of  the  drop 
zone  at  Fort  A.P.  Hill,  Virginia.  Since  the  cuer  operation  was  not  different  between 
these  scenarios,  all  that  the  pM  values  show  is  the  differences  between  data  sets  used 
in  testing.  To  establish  an  absolute  value  for  any  performance  measure  that  can  be 
generalized  to  the  real  world  performance,  the  distribution  of  samples  in  the  data  set 
must  accurately  reflect  the  variety  in  the  real  world.  Acquiring  sufficient  thermal 
imagery  to  form  this  data  base  is  clearly  impractical.  Simply  specifying  what  should 
be  included  is  beyond  the  state  of  the  art  in  image  description. 


Table 2: Absolute cuer performance (PATS)

Date        Time       t       PR    PI    pM         PD    F     PMd

15 Nov 79   1530 hrs   1.5 s   .16   .30   0.9 s^-1   .57   .01   3.5 s^-1
12 Mar 80   1530 hrs   4       .05   .06   .15        .44   .26   .5
14 Mar 80   1630 hrs   5       .03   .10   .02        .58   .82   .4
15 Mar 80   1600 hrs   5       .04   .03   .17        .45   .38   .4
15 Mar 80   2000 hrs   6       .07   .15   .09        .56   .64   .3

An  alternative  to  forming  this  massive  data  base  is  the  use  of  theoretical  models 
(verified  over  a  limited  data  base)  to  predict  the  full  spectrum  of  variety  based  on 
measurements  other  than  thermal  imagery.  Target  orientation,  target  temperature,  time, 
season,  solar  insolation,  humidity,  and  other  parameters  would  be  used  with  this  model 
to predict the "inherent detectability or recognizability" of a target in a given
thermal  image  set;  then  the  results  of  an  absolute  pM  on  a  cuer  could  be  compared  to 
the  "inherent  characteristic"  and  the  cuer  performance  rated.  Unfortunately,  the 
current  status  of  thermal  image  modeling  is  not  sufficient  to  predict  this  "inherent 
characteristic"  from  a  set  of  measured  parameters.  This  situation  is  not  expected 
to  change  in  the  near  future  so  it  is  important  that  another  alternative  be  formulated 
to  compare  cuer  absolute  performance  on  different  data  bases. 

5.3  COMPARISON  WITH  OBSERVERS 

One  way  to  gauge  the  difficulty  of  the  detection  or  recognition  tasks  assigned  to  cuers 
on  specific  data  sets  is  to  present  the  imagery  to  an  average  human  observer.  The 
results  of  the  observer's  performance  can  then  be  compared  to  the  cuer's.  This  procedure 
is  filled  with  pitfalls,  but  to  date  it  has  been  the  most  reliable  method  of  assessing 
cuer  performance.  The  average  observer,  of  course,  is  a  myth.  Instead,  a  group  of 
observers  is  used  to  smooth  out  the  person  to  person  and  day  to  day  variation  of 
individuals.  This  group  must  also  be  checked  periodically  on  the  same  data  to  determine 
group  drift.  In  addition,  it  is  important  that  tests  given  to  the  observers  do  not 
allow  them  to  learn  during  the  testing.  Since  this  is  next  to  impossible,  it  is 
important  to  design  the  tests  so  that  results  from  observers  who  do  learn  during  the 
testing  can  be  eliminated  from  the  group.  For  example,  one  set  of  imagery  shown  to 
the  observers  for  measuring  observer  pM  contained  target  vehicles  that  were  always 
oriented in the same direction as each other within each image. One observer discovered
this during testing. Once the orientation of one vehicle could be determined, this
observer knew the orientation of all the others and so performed better than he would
have without this information. The primary method used to detect observer learning is
to repeat the first portion of the test as if it were simply a continuation of the
original test. If an observer improves his performance on that portion the second
time, his results must be discarded.
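
The repeat-portion check described above amounts to a simple screening rule. The following Python sketch is a hypothetical illustration, not the procedure actually coded for the PATS observer tests: the function name, the data layout, and the `tolerance` allowance for ordinary scatter are all assumptions, and the scores in the usage example are invented.

```python
def screen_observers(scores, tolerance=0.0):
    """Split observers into (kept, discarded) lists.

    scores maps an observer id to (first_pass, repeat_pass): the performance
    measure on the opening portion of the test and on its disguised repeat.
    tolerance is a hypothetical allowance for ordinary scatter between passes.
    """
    kept, discarded = [], []
    for observer, (first_pass, repeat_pass) in sorted(scores.items()):
        if repeat_pass > first_pass + tolerance:
            # Improvement on identical imagery is taken as evidence of learning.
            discarded.append(observer)
        else:
            kept.append(observer)
    return kept, discarded


if __name__ == "__main__":
    # Invented scores, for illustration only.
    example = {"A": (2.0, 2.0), "B": (1.5, 3.0), "C": (2.5, 2.1)}
    print(screen_observers(example))
```

A nonzero tolerance would keep observers whose repeat score rose only slightly; the text itself implies a tolerance of zero.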

Handling  the  time  an  observer  is  allowed  to  view  and  make  decisions  on  the  test  imagery 
is  also  an  issue  that  must  be  decided  carefully.  For  most  of  the  comparisons  that  have 
been  made  to  date,  the  observers  viewed  only  a  few  seconds  of  videotape  and  were  forced 
to  respond  during  the  viewing  in  an  effort  to  force  a  response  time  close  to  PATS'. 
However, another test has been set up to task load the observers during the viewing
of thirty seconds of tape. This more nearly simulates the field environment but is
difficult  to  use  for  a  direct  comparison  with  the  cuer.  It  is  probably  more  useful 
in  comparing  operator  performance  with  and  without  a  cuer,  something  we  hope  to  do 
shortly. 

Given  the  above  limitations,  tables  3  and  4  show  typical  results  of  these  types  of 
comparisons  for  representative  scenarios  during  the  preliminary  flight  test  of  PATS 
in  March.  All  the  comparisons  shown  use  night  imagery  because  the  daytime  imagery 
that  would  be  useful  for  this  comparison  was  too  cluttered  for  either  the  observers 
or  PATS  to  perform  at  accurately  measurable  levels. 


Table 3: Recognition performance of observers compared to PATS

                       Obs.  Obs.  Obs.  Obs.  Obs.       PATS  PATS  PATS  PATS  PATS       Obs. pM /
Date        Time       n     t     PR    PI    pM         n     t     PR    PI    pM         PATS pM

14 Mar 80   2230 hrs   4     5 s   .21   .42   0.3 s^-1   4     5 s   .06   .06   0.2 s^-1   1.7
15 Mar 80*  1945 hrs   4     5     .49   .22   3.5*       4     5     .07   .15   0.11       32*

*For  this  test  the  observers'  task  differed  from  PATS'  because  the  observers  knew  before 
their  five  second  observation  period  the  target  array  pattern  and  the  array's  location. 


Table 4: Detection performance of observers compared to PATS

                       Obs.  Obs.  Obs.  Obs.       PATS  PATS  PATS  PATS       Obs. PMd /
Date        Time       t     PD    F     PMd        t     PD    F     PMd        PATS PMd

14 Mar 80   2230 hrs   5 s   .89   .05   4.5 s^-1   5 s   .46   .34   .42 s^-1   11
15 Mar 80   2130 hrs   5     .85   .11   2.6        5     .15   1.5   .05        51

5.4  MEASURES  OF  CUER  IMPROVEMENT 

Like all newly emerging technologies, cuer performance will improve rapidly with time.
It is therefore necessary to measure this improvement. For the reasons described
in section 5.2, no absolute measure of performance can be used for this task when trying
to compare results from two separate field tests. There are two avenues which can be
pursued. First, one could compare cuer operation at two different times on the identical
input data. It is important when trying to assess general improvement to use different
test data than training data. This is always impossible, but one strives to do the best
he can with the available data. This was done with PATS in July 1980 by inputting
videotapes from November 1979 and March 1980 with PATS on the bench. It was also necessary
before drawing conclusions from this test to show that PATS operated the same on the
bench as it did in the aircraft. After having shown this, PATS' detection performance
was compared before and after some major hardware and software changes. The ratio of
July to March performance showed an improvement by a factor of three. This was considered
significant enough to attempt another flight test in August 1980. Results of that
test will be forthcoming. In early August, just prior to the flight test, PATS was
tested on a tape from which no training samples had been taken. As shown in table 5,
PATS detection improved by a factor of five.


Table 5: Measure of PATS improvement

Date of Test   t     PD    F     PMd

1 Aug 80       3 s   .79   .37   1.8 s^-1
15 Mar 80      6     .56   .64   .35
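
The improvement factor quoted for table 5 is the ratio of the two detection measures. The Python sketch below recomputes that ratio from the tabulated PD, F and t, assuming the detection measure has the form 4 PD / ((1 - PD + F) t) given in section 5.1; the function names are illustrative and not taken from the PATS software.

```python
def detection_pm(pd, f, t):
    """Detection measure PMd = 4 * PD / ((1 - PD + F) * t), in units of 1/s."""
    return 4.0 * pd / (((1.0 - pd) + f) * t)


def improvement_factor(new, old):
    """Ratio of detection measures; new and old are (PD, F, t) tuples."""
    return detection_pm(*new) / detection_pm(*old)


# Table 5 entries as (PD, F, t in seconds).
AUGUST_1980 = (0.79, 0.37, 3.0)
MARCH_1980 = (0.56, 0.64, 6.0)

if __name__ == "__main__":
    print(improvement_factor(AUGUST_1980, MARCH_1980))
```

Because the ratio is taken on the same kind of measure, its unit (1/s) cancels and the factor is dimensionless.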

A  second  method  of  measuring  cuer  improvement  can  be  used  if  one  has  a  measure  of  the 
difficulty  of  the  task.  As  described  in  section  5.3,  observers  can  be  used  to  provide 
this  standard.  How  well  the  cuer  had  done  compared  to  an  average  observer  on  one 
data  set  can  then  be  compared  to  how  well  the  improved  cuer  does  compared  to  an  average 
observer  on  the  new  data  set.  This  method  will  be  used  to  report  the  results  of  the 
August  tests  as  they  become  available. 

5.5  COMPARISON  BETWEEN  CUERS 

As other realtime cuers are developed and field tested, it will become essential to
compare the strengths and weaknesses of each. The ideal procedure would be to compare
them on the same input data, but this will frequently be impossible. For example,
Northrop is fabricating a cuer which will be ready for preliminary testing next year.
This cuer is also designed to operate on LOHTADS thermal imagery, but it incorporates
a digital scan converter in the front end. It cannot operate on the video-formatted
imagery required for PATS operation. It might be possible to modify the Northrop
cuer just to furnish a video output that could be fed to PATS, but even this comparison
would not be perfectly valid because PATS would be forced to operate on video from an
imaging system whose gain and bias it is not controlling. Most comparisons between
cuers will not even be as simple as this one since they will not be able to operate
from the same sensor.

One  way  to  make  the  comparison  is  through  the  average  observer.  The  observer  gives  a 
measure  of  the  difficulty  of  the  task,  and  cuers  are  compared  by  measuring  how  much 
better  than  observers  they  do.  Once  again,  the  more  similar  the  test  data,  the  more 
confidence  there  will  be  in  the  result.  If  cuers  evolve  to  do  different  tasks,  this 
comparison  will  never  be  made  properly  because  the  two  cuers  will  not  be  trained  with 
similar  data. 

6.0  CONCLUSIONS 

Procedures  are  being  developed  and  used  to  evaluate  realtime  target  cuers.  These 
procedures  need  to  be  greatly  refined  and  improved  in  order  to  meet  the  rapidly  growing 
need for quantitative cuer assessment.

FIGURE 1: PATS HARDWARE WITH THE TOP COVER REMOVED.
THE SMALL CONTROL BOX IS MOUNTED BETWEEN THE TWO SEATS IN THE HELICOPTER.

FIGURE 2: BLOCK DIAGRAM OF PATS' OPERATION.


DIGITAL IMAGE PROCESSING FOR GROUND TARGET DETECTION, IDENTIFICATION AND LOCATION

George  R.  Hughes 
RADC/IRR 

Griffiss AFB, New York

ABSTRACT/SUMMARY

This paper discusses RADC/USAF Research and Development activities in the areas of
image processing and target identification relating to the identification and location
of ground targets in a tactical scenario. The primary objective is to reduce the
reconnaissance cycle from days/hours to minutes and seconds, commensurate with the near
real time (NRT) intelligence requirements of the tactical forces. This NRT intelligence
is essential to support strike of increasingly mobile enemy weapon systems. In
currently fielded systems, exploitation of film based reconnaissance is extremely slow,
greatly lagging collection rates.

There are three essential elements to a NRT tactical intelligence system. They
are NRT imagery collection, NRT air to ground image data link and NRT imagery exploitation.
The thrust of this paper will be to discuss technology required to support the
latter, NRT imagery exploitation.

Technology intensive efforts are categorized under each of the exploitation
elements (target detection, identification and precision location). R&D in the area of
target detection consists of exploratory and advanced development work units in automated
target correlation, automatic change detection and pipeline image processing for screening
probable target areas. Target identification R&D to be presented includes automatic
techniques for pattern recognition as well as semi-automated techniques for aiding an
analyst by correlating various sensor and intelligence inputs to permit target
identification. Near real time precision target location techniques will include techniques
for locating imagery targets in a predefined precision photographic data base as well
as techniques for performing location simultaneously with target identification.

Summary  and  conclusions  for  this  presentation  discuss  commonality  aspects  of  a 
digital  image  exploitation  system  relating  to  a  "universal"  image  exploitation  system 
and  the  potential  for  its  use  in  NATO.  Potential  areas  for  developing  cooperative  R&D 
programs  in  support  of  this  universal  system  objective  will  be  identified. 

BACKGROUND 


A primary mission of the tactical Air Force is to neutralize enemy ground combat
forces through strike of second echelon ground support forces. Optical reconnaissance
sensors, using hardcopy film as a recording medium, have long been employed to collect
intelligence required to identify and locate targets in support of this tactical
interdiction mission.

Limitations of these sensors in support of fast moving tactical combat operations
have been multi-faceted. The primary limitation of optical sensors is that they operate
only in daytime and clear weather. In the late fifties imaging infrared and radar sensors
were developed. Problems relating to daylight and all weather tactical collection
capabilities were alleviated. However, film based collection continued to severely limit
the timeliness of the tactical intelligence information collected. For example, the time
from recce aircraft over target to target intelligence included aircraft return to base,
recce film downloading, film processing, and target identification and location. The time
required to accomplish these functions is measured in hours. Increasing mobility of tactical
weapon systems dictated that methods be developed to detect, identify and locate
tactical ground targets within minutes, and even seconds, of collection. To accommodate
this urgent requirement, near real time image intelligence systems were conceived and
research and development objectives identified. The term near real time (NRT)
intelligence was coined and defined as the process of providing tactical intelligence to
command and control elements within minutes of recce aircraft time over target. To
achieve NRT intelligence, revolutionary recce systems were conceived. However, initial
concepts (1960's) were technology limited. NRT systems of the 60's employed advanced
all weather day/night sensor systems providing an image reconnaissance collection
capability which was not limited to fair weather or daylight. However, NRT exploitation
of this data was not possible. Extremely high data rates from imaging sensors limited
the recording media to silver halide film. Attendant chemical processing and film
handling problems extended the image exploitation cycle to hours.

In the Vietnam War era reconnaissance systems were developed which employed
various types of exotic film processing strategies designed to speed up the
image exploitation cycle. These systems met with limited success primarily because of
the specialized procedures employed to process the film. However, time constraints
related to analog film exploitation continued. It was apparent that technology was
required to replace analog film as the recording medium to meet NRT imagery
intelligence requirements.

The first NRT image reconnaissance system was developed and experimentally tested by
RADC in the late 60's. The system employed an infrared sensor whose output was
simultaneously recorded electrically on tape (eliminating the analog film record) and
displayed in the aircraft on a CRT display. Real time constraints coupled with limited
resolution limited the value of the aircraft display. The tape recording was, however,
successfully exploited at the ground station. With the improved time lines of this
system, numerous versions of systems designed to improve timeliness of tactical
intelligence were developed. Initial success with electrically recording infrared
imagery led to the development of data link systems to transmit analog electrical
infrared and radar imagery directly to the ground exploitation system. This eliminated
the time required for the collection aircraft to return to base and download the sensor
record. The performance of these NRT sensor/data link systems continues to be
progressively improved.

There are three essential elements to a NRT tactical imagery intelligence system.
They are NRT imagery collection, NRT air to ground imagery transmission and NRT
imagery exploitation. Systems and technology to field the first two elements are
progressing. The emergence of the NRT collection/transmission technologies has
magnified problems in NRT imagery exploitation. Time required for target detection,
identification and location (NRT imagery exploitation) represents a serious deficiency
and limit on the overall effectiveness of NRT image intelligence systems.

NEAR  REAL  TIME  EXPLOITATION  TECHNIQUES 


NRT imagery exploitation (for the purpose of this paper) is defined as the process
of detecting, identifying and locating ground targets within five minutes of receipt of
digital imagery from the collector.

The requirement for more rapid imagery exploitation response has resulted in a
major thrust of research and development programs in the direction of aiding the image
interpreter. NRT image exploitation presents two major inter-related problems to the
system designer. They are the high system data rate required to process digital imagery
and the time allowed to accomplish the digital image exploitation. The time element is
primary. The system must be designed to optimize the time element. When and if a
collection platform is directed to a single target or a limited number of targets, data
rate is relatively low and the design of a NRT exploitation system is relatively
straightforward. The real world situation, however, has shown that NRT collection systems
collect large volumes of ground coverage, necessitating extremely high data rates. For
example, a two minute forward looking infrared (FLIR) mission covering four target
sites may include up to 3600 frames of data. Imaging radar systems can collect over
4000 square miles of ground coverage an hour, while current manual exploitation averages
200 square miles/hour for the average interpreter using a normal light table and ranges
to higher values for highly skilled interpreters using specialized interpretation
equipment. To further compound the problem, more than one collection aircraft may be
data linking to a single exploitation station. It is, therefore, unreasonable to even
consider brute force (image by image) processing of NRT image data. Traditional
approaches which use manual exploitation techniques cannot accommodate the NRT imagery
data rates. Methods must be applied to filter and sort NRT imagery so that the target
area is detected and extraneous data disregarded.
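
The mismatch between collection and manual exploitation rates can be made concrete with a little arithmetic. The Python sketch below is only a back-of-envelope illustration; the function names are invented, and the 30 frames per second used in the example is an inference from "3600 frames in two minutes", not a figure stated in the text.

```python
import math


def flir_frames(duration_s, frame_rate_hz):
    """Number of frames collected in duration_s seconds at frame_rate_hz."""
    return duration_s * frame_rate_hz


def interpreters_needed(collection_rate, interpreter_rate, collectors=1):
    """Interpreters required to keep pace with collection.

    Both rates are in the same units (here square miles per hour);
    collectors is the number of aircraft data linking to one station.
    """
    return math.ceil(collectors * collection_rate / interpreter_rate)


if __name__ == "__main__":
    # Two minute FLIR mission at an assumed 30 frames per second.
    print(flir_frames(120, 30))
    # Imaging radar at 4000 sq mi/hr against 200 sq mi/hr per interpreter.
    print(interpreters_needed(4000, 200))
    print(interpreters_needed(4000, 200, collectors=2))
```

On the figures quoted, a single radar collector would occupy on the order of twenty average interpreters, which is why filtering rather than image-by-image review is argued for.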

Techniques for filtering imagery may be generally sorted into two major categories.
The  two  categories  are: 

.  Imagery/Intelligence  Correlation 

.  Automatic  Image  Processing 

Imagery/Intelligence Correlation subsystems involve the correlation of intelligence
from other sources and sensors to identify areas of interest on NRT imagery.
In most situations the indications from other sources are adequate to detect gross
areas of activity. However, the nature of the other sensors is such that target
identification and precise location data must be extracted from image source data.
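
One way to picture this kind of filtering is to keep only those image frames whose ground footprint contains an indication from another source. The Python sketch below is a hypothetical illustration under simplifying assumptions (rectangular, axis-aligned footprints in a common ground coordinate system; point cues); it is not the ASE correlation software, and the `Frame` type, `frames_with_cues` function and `margin` parameter are invented here.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    """An image frame and its assumed rectangular ground footprint."""
    frame_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def frames_with_cues(frames, cues, margin=0.0):
    """Return the frames whose footprint, padded by margin, holds a cue.

    cues is a list of (x, y) ground positions reported by other sensors;
    margin is a hypothetical allowance for cue location error.
    """
    selected = []
    for frame in frames:
        for cx, cy in cues:
            inside_x = frame.x_min - margin <= cx <= frame.x_max + margin
            inside_y = frame.y_min - margin <= cy <= frame.y_max + margin
            if inside_x and inside_y:
                selected.append(frame)
                break  # one cue is enough to keep the frame
    return selected


if __name__ == "__main__":
    strip = [Frame(1, 0, 0, 10, 10), Frame(2, 10, 0, 20, 10), Frame(3, 20, 0, 30, 10)]
    print([f.frame_id for f in frames_with_cues(strip, [(12.0, 5.0)])])
```

Widening the margin trades fewer missed target areas for more frames passed to the interpreter, which reflects the coarse location accuracy the text attributes to the other sensors.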

The Advanced Sensor Exploitation (ASE) program is a USAF advanced development
program directed at the requirement for technology development in support of imagery
intelligence correlation. The ASE program addresses the requirement for the rapid
correlation, integration and display of (near) real-time combat sensor information
pertaining to high priority mobile ground targets.

The overall program objectives are to develop, demonstrate and validate automated
processing techniques/technology to improve the timeliness, completeness and accuracy
of dynamic ground target information for tactical command and control elements. The
developments focus on the use of advanced (near) real-time standoff sensor/strike
systems and the technology required to effectively and optimally exploit these
resources in a collective manner in support of the current operations function of the
tactical Air Force.

Due to the evolutionary and rapid developments in sensor technology and concepts
in this area of (near) real-time combat information processing, the program consists of
two major phases. The first phase is concentrating on the technology of automated data
handling, correlation, tracking and display of the dynamic ground target situation.
The initial demonstration will utilize simulated sensor inputs collected against a
simulated enemy force. The successful demonstration of on-line exploitation of the
simulated sensors will be the basis for the planned second phase of the program, which
will transition the technology into a field model for demonstration with live sensors
during the FY84/85 time frame.

ASE processes and correlates NRT sensor data from advanced outyear airborne sensor
systems that are currently under development. Live sensor data from these advanced
sensors is not readily available at this time. The ASE approach is to simulate these
sensors. Sensor systems to be simulated are from four generic categories: wideband
emitter detector; moving target indicator; narrowband emitter detector; and an advanced
Synthetic Aperture Radar sensor. A battlefield scenario is generated by feeding
laboratory generated scenario information to the sensor models. Activities focus on
developing the high speed automated data handling and display capabilities necessary to
fully exploit advanced sensor systems in a complementary fashion; the development of
correlation software to present a composite picture of second echelon mobile ground
targets from multi-sensor data inputs; demonstrating continuous tracking of these
mobile ground targets; developing automated sensor cueing techniques to allow handing
off of a group of targets from one sensor system to another when tracking has been or
will be lost by a particular sensor; evaluating the digital cartographic data, or
electronic map technology, required to support advanced sensor data exploitation; and
ultimately, developing the capability to maintain and update a near real-time dynamic
ground order of battle. The resulting output of dynamic ground target information from
the technology developed in ASE is expected to provide C3I nodes with a composite
picture of the second echelon ground environment at a level of detail that will favorably
impact target nominations and the interdiction process.

In  summary,  the  objective  of  the  ASE  demonstration  effort  is  to  develop  the 
automated  data  processing  technology  to: 

(1)  Automatically  correlate  four  generic  high  capacity  sensors:  MTI,  narrow  and 
wide  band  emitter  detectors  and  an  NRT  imaging  source. 

(2)  Demonstrate  bi-directional  interface  with  the  sensors  and  tactical  command, 
control  and  intelligence  elements. 

(3)  Demonstrate  automated  cueing  and  tracking  of  high  priority  mobile  targets 
over  a  200km  x  200km  geographic  area. 

(4)  Demonstrate display of dynamic ground target situation for tactical battlefield
surveillance and support to current operation activities.

(5)  Demonstrate application of automated imagery/intelligence correlation techniques
to filter NRT imagery.

This  demonstration  will  be  conducted  utilizing  a  simulated  divisional  ground  force 
operating  over  a  200  km  x  200  km  geographic  area.  A  set  of  advanced  sensor  models  will 
provide  high  volume  sensor  information  to  demonstrate  the  above  ASE  capabilities.  The 
results  of  this  effort  shall  be  a  quantified  processing  baseline  for  exploiting  and 
integrating  the  high  volume  information  available  from  these  advanced  sensors. 

This advanced development effort is an evolutionary step in the development of
automated data handling, correlation and display of (near) real-time multisource combat
information.

The ASE system's multisource correlation capability will provide an efficient
method of filtering NRT imagery at locations and in situations where other sensor
source data is available and practical. In remote areas where NRT image systems are
deployed in an austere environment, and where complex computer processing capabilities
such as those typified by ASE are not available, alternate means for filtering
NRT imagery must be employed. Since the use of a large number of interpreters to
screen and detect targets is not an acceptable alternative, the only alternative is to
employ automatic image processing techniques to filter imagery for NRT target detection.

Automatic image processing for imagery filtering can be subdivided into two
categories: automatic image screening and automatic change detection. Automatic image
screening employs automated techniques for logically processing imagery to determine
the presence of target type patterns in imagery. Early attempts at automatic screening
and change detection employed complex optical filtering techniques for detecting and
identifying target patterns and changes. Size, complexity, and related problems resulted
in very little success with this approach. False alarm and missed target rates were
unacceptable. The technology explosion in digital processing hardware led to the
application of digital techniques to image processing. Digital techniques are
currently employed for automatic image screening and for automatic change detection.
A description of each of these image filtering techniques follows.

The approach for automatic image screening for NRT target detection is based on
technology that has evolved from guidance systems used in air-to-air rockets, infrared
heat seekers, and recent developments in digital target detection techniques that
utilize pipeline processing technology to accomplish the target detection. This approach
is cost effective when implemented in firmware rather than software. In the case of
infrared, the thresholding of signals allows the detection of warm targets against a
cooler background, and thus target detection becomes possible; for electro-optical and
radar sensors, however, the problem is more difficult. The baseline for target detection
in real time or near real time requires the use of pipeline technology that automatically
performs the same functions on all of the image in a pipeline fashion. This requires
that the data first be normalized by a look-up table, at throughput rates such that the
detector can operate on data optimized for target detection. The next process in the
pipeline is usually some kind of filtering, either low pass or high pass, which performs
a screening function by eliminating unwanted data bits and extraneous signals. The
actual target detection may consist of a gradient analysis and polygon fit with
selectable thresholds. The thresholds can be in intensity, area and shape. The setting of
automatic threshold control circuits determines the false alarm rate and/or the percent
of missed targets. This approach is supported by technology efforts in industry and
universities that provide for development of VLSI chips that can be applied to the
target identification task in the future.
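
The pipeline stages described above (look-up-table normalization, filtering, gradient
analysis, thresholding) can be sketched in software as follows. This is only an
illustrative Python sketch, not the firmware implementation the text refers to. The
function names, the 3x3 mean filter, the central-difference gradient and the numeric
thresholds are all assumptions, and the polygon fit and shape threshold are omitted.

```python
# Illustrative pipeline: look-up-table normalization -> low-pass filter ->
# gradient analysis -> thresholds on gradient strength and area.
# All names and numeric values are hypothetical, not taken from the paper.

def normalize(image, lut):
    """Map every raw pixel value through a look-up table."""
    return [[lut[p] for p in row] for row in image]

def low_pass(image):
    """3x3 mean filter; border pixels are left unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(image[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total // 9
    return out

def gradient(image):
    """Sum of absolute horizontal and vertical central differences."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            out[y][x] = abs(gx) + abs(gy)
    return out

def detect(image, lut, gradient_threshold, min_area):
    """Return True when enough pixels exceed the gradient threshold."""
    g = gradient(low_pass(normalize(image, lut)))
    area = sum(1 for row in g for v in row if v >= gradient_threshold)
    return area >= min_area

# Example: an 8x8 scene containing a bright 4x4 block, with an identity table.
lut = list(range(256))
scene = [[200 if 2 <= y <= 5 and 2 <= x <= 5 else 0 for x in range(8)]
         for y in range(8)]
found = detect(scene, lut, gradient_threshold=50, min_area=4)
```

Raising `gradient_threshold` or `min_area` in this sketch reduces detections on clutter
but also makes it easier to miss a faint or small target, which is the trade-off the
threshold control circuits have to manage.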

Target  detection  is  accomplished  on  the  whole  image  format  by  using  real  time 
decimation  and  nearest  neighbor  interpolation  processes  to  reduce  the  image  to  an 
optimized  scale  for  detection  but  without  losing  the  resolution  required  for  target 
detection.  The  decimation  process  should  be  selectable  with  a  range  from  1  to  X  pixels. 
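
A selectable decimation step of the kind described above might look like the following
sketch. It assumes the simplest form of nearest-neighbor reduction, keeping every
`factor`-th pixel along both axes; the function name and the list-of-rows image
representation are illustrative assumptions.

```python
def decimate(image, factor):
    """Keep every `factor`-th pixel along both axes (nearest-neighbor sampling).

    A factor of 1 returns a copy of the image at full resolution.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [row[::factor] for row in image[::factor]]

# Example: reduce a 4x4 image by a factor of 2 in each direction.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
small = decimate(image, 2)
```

The factor has to be chosen so that a target still covers enough pixels after reduction
to be detected, which is the resolution constraint stated in the text.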

The target detection would take place directly from the main data stream of digital
imagery data prior to storing the full resolution data on disk. This will result in
the target detector providing a cue to the control computer that a target exists at a
given line and pixel value and that a sub-image containing that target should immediately
be passed to the first available interpreter for verification. This eliminates the
need for interpreters to screen the complete image for the detected target.
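
The cue-to-sub-image step can be illustrated with a short sketch. It assumes the cue is
a (line, pixel) pair and that the sub-image is a square chip centered on it and clipped
at the image edges; the function name and the `half_size` parameter are hypothetical.

```python
def extract_sub_image(image, line, pixel, half_size):
    """Cut a square chip centered on the cued (line, pixel), clipped at the edges."""
    top = max(0, line - half_size)
    bottom = min(len(image), line + half_size + 1)
    left = max(0, pixel - half_size)
    right = min(len(image[0]), pixel + half_size + 1)
    return [row[left:right] for row in image[top:bottom]]

# Example: a 3x3 chip around the cue at line 2, pixel 2 of a 5x5 image.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
chip = extract_sub_image(image, 2, 2, 1)
```

Only the chip, rather than the complete frame, would then be routed to the interpreter
for verification.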

It is planned that automatic identification of targets will be accomplished by
carrying the pipeline process several steps further, employing multi-dimensional
filtering capable of performing the identification task. The outputs of the identifier
would be annotated at time of verification.

In  the  case  of  poor  target  detection  results  the  interpreter  can  select  a  screening 
mode  that  will  automatically  change  sub-images  for  him  and  allow  him  to  screen  and 
detect  targets  with  computer  assistance. 

An alternate concept for filtering NRT imagery employs automatic change detection.
RADC has been involved in the development of change detection techniques for Synthetic
Aperture Side-Looking Airborne Radar (SLAR) since 1965. Computer simulations, using
large general-purpose computers, demonstrated the feasibility of automatic digital change
detection between pairs of SLAR film images in the late 1960s. These studies led to
the development of digital procedures for registration of the film pairs that were more
exact than any analogue capability. The simulation studies also led to the development
of a hardwired processor to perform the matching and change detection algorithms. This
system, known as the ARRES system, was delivered to RADC in August of 1972 and
demonstrated on-line in October of that year.

Following this success, various refinements and additions to the simulation
software indicated a need for a change detection system with the flexibility to adapt
to algorithm refinements. To meet the new sensor developments a breadboard
special-purpose processor was built and demonstrated in January 1973. An advanced
development model, known as the Digital Modular Change Detection (DMCD) system, was built
and delivered in 1977. This system is currently being modified to provide a demonstration
of automated advanced large area exploitation techniques for the UPD-4 high resolution
SLAR sensor in USAFE in 1981.
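
The two steps named above, registration of an image pair followed by change detection,
can be illustrated with a minimal sketch. It is not the ARRES or DMCD algorithm. The
exhaustive integer-shift search, the mean-absolute-difference match criterion and the
fixed difference threshold are assumptions chosen for brevity.

```python
def mean_abs_diff(ref, new, dy, dx):
    """Mean absolute difference where ref and the shifted new image overlap.

    Assumes the shift is smaller than the image, so the overlap is never empty.
    """
    h, w = len(ref), len(ref[0])
    total = count = 0
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                total += abs(ref[y][x] - new[yy][xx])
                count += 1
    return total / count

def register(ref, new, max_shift):
    """Exhaustive search for the integer (dy, dx) that best aligns new to ref."""
    shifts = [(dy, dx)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: mean_abs_diff(ref, new, *s))

def detect_changes(ref, new, shift, threshold):
    """List (line, pixel) positions, in ref coordinates, that differ by more than threshold."""
    dy, dx = shift
    h, w = len(ref), len(ref[0])
    changes = []
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                if abs(ref[y][x] - new[yy][xx]) > threshold:
                    changes.append((y, x))
    return changes

# Example: the second pass is displaced one line and contains one new object.
ref = [[0] * 5 for _ in range(5)]
ref[2][2] = 100                      # feature present in both passes
new = [[0] * 5 for _ in range(5)]
new[3][2] = 100                      # same feature, displaced by one line
new[1][4] = 80                       # object present only in the second pass
shift = register(ref, new, max_shift=1)
changes = detect_changes(ref, new, shift, threshold=50)
```

Without the registration step, the displaced feature itself would be reported as two
changes, which is why the accuracy of registration matters so much to the false alarm
rate.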

These automated systems for filtering NRT imagery have been touted by the R&D
community as the means of processing high volumes of imagery to screen and detect
target activity. Successes to date have been limited to laboratory demonstrations
under controlled conditions. Results to date indicate problems in two areas:

.  False  Alarms 

.  Missed  Targets 

False alarms occur when the target detector or change detector indicates
targets where no target exists. In an attempt to reduce false alarms, research activity
has been directed at techniques for utilizing logic and automated pattern analysis
to verify target areas and reject false alarms. Another approach to reducing
false alarms has been to increase thresholds and criteria for target detection. This
approach, however, also increases the probability of missed targets (targets not
detected in the autodetection or change detection process). Although operationally
unproven, these technologies hold much promise for future NRT imagery exploitation
systems. With further development, image screening devices will be adapted for
in-flight application. In airborne application the filter would be applied in the airborne
platform, significantly reducing the air-to-ground imagery data link data rates.

However, before airborne or, for that matter, ground applications are considered, very
careful analysis of operational test data for false alarm rates, missed targets and
other related issues must be accomplished.
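
The trade-off between false alarms and missed targets can be made concrete with a small
sketch of the kind of bookkeeping such an analysis needs. The detector scores and truth
labels below are invented for illustration only; they are not test data from any system
described here.

```python
def error_counts(scores, is_target, threshold):
    """Count false alarms and missed targets at one detection threshold."""
    false_alarms = sum(1 for s, t in zip(scores, is_target)
                       if s >= threshold and not t)
    missed = sum(1 for s, t in zip(scores, is_target)
                 if s < threshold and t)
    return false_alarms, missed

def sweep(scores, is_target, thresholds):
    """Tabulate (false alarms, missed targets) for each candidate threshold."""
    return {th: error_counts(scores, is_target, th) for th in thresholds}

# Hypothetical detector outputs and ground truth for six candidate objects.
scores = [10, 20, 30, 40, 50, 60]
is_target = [False, False, True, False, True, True]
table = sweep(scores, is_target, [15, 35, 55])
```

In this invented example, raising the threshold from 15 to 55 removes both false alarms
but causes two of the three true targets to be missed, which is the behavior the text
warns about.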


To meet these objectives, the Data Handling Recording System (DHRS) is currently
under development at USAF/RADC. With the aids designed within the DHRS, the equipment
is uniquely adapted to performing NRT interpretation of the imagery data. The basic
DHRS reporting facility is essentially composed of four distinct parts or modules
(refer to figure 1): SIM (Sensor Input Module), S/RM (Storage/Retrieval Module), RTPM
(Real Time Processing Module), and NRTEM (Near Real-Time Exploitation Module). The
digital imagery data will be stored in the S/RM with the digital image represented as a
complete entity.

A ground target screening system will provide for the detection of man-made objects
and for the classification of targets based on geometric characteristics (e.g., edge
data, object brightness, symmetry). Classification of objects (e.g., tanks, trucks) is
based upon object size, texture, contrast, edge straightness, and components (e.g.,
wheels, tracks). The extent of target classification that can be incorporated into the
Real Time Processing Module will be limited by the state of the art of that technology.
The classifier portion of the Real Time Processing Module will be designed so that
modifications (software and hardware) can be incorporated as technology advances. Upon
the detection of a man-made object, the RTPM will extract the portion of the frame in
which the object is located for further analysis (classification and display). The
object in the displayed scene will be highlighted so that it can be detected by the
display operator for target confirmation at the exploitation module. The highlighting
symbol (alpha/numeric) appearing on the display will indicate the type of target
detected. The Storage and Retrieval Module will provide full-frame imagery storage
interfaced to a full display image for image enhancement. This technique is better
than other approaches considered since only pertinent portions of the original (sensor)
digital image are extracted and less important portions are deleted. This will give the
TAC commanders greater flexibility in viewing only those portions of imagery that are
of importance for timely decision and strike. A computer will monitor the data (I/O)
and functions of the processing systems. Such a design scheme will provide centralized
control of the digital storage, enhancement, and dissemination to Tactical Commanders.

The  Near  Real  Time  Exploitation  Module  of  the  DHRS  will  provide  a  high  quality 
image  display,  alpha/numeric  annotation,  image  enhancement,  and  dissemination.  This 
will  consist  of  the  following  equipment: 

(1)  Digital  Image  Enhancement  Equipment 

(2)  High  Quality  Image  Display 

(3)  Image  Processor 

(4) Alpha/Numeric Software

(5)  Dissemination  Subsystem 

In a typical scenario, the image screening module will designate man-made objects
(MMO's) and determine whether the objects pertain to a specific geometric shape,
e.g., circles (oil tanks, drums), squares, and rectangles (buildings). After a desired
scene has been extracted from the frame and analyzed via the processing module, the
image is then transferred to the NRT Exploitation Module. If target confirmation
(classification) cannot be determined, the operator will have the capability to
pinpoint desired pixels (and subsequent locations) in a particular region of the extracted
frame. Alphanumerics/Graphics will provide the capability to annotate the imagery on
the image display. The pattern recognition software will provide appropriate symbology
to identify buildings, roads, target positions, and other distinguishable cultural and
terrain features. The Alpha/Numerics (Graphics) will provide the capability of
annotating particular target areas of interest with appropriate symbology. The Digital
Image Enhancement will provide any required enhancement to the image displayed.

The Display will be the central unit around which a facility configuration will be
assembled. This will consist of a high quality CRT display with full and pseudo color
capability, zoom and scroll features, and the capability to store selected digital
imagery.

The Image Processor provides the display with the necessary software to enhance
the displayed region by correcting for radiometric and geometric distortions and by
providing filtering functions (high and low frequency discrimination), edge enhancement
and alignment provisions. The main CPU (Controller) provides the necessary algorithms for
computation of ground coordinates via a priori information, e.g., mapping functions,
sensor orientation, and INS (Inertial Navigation System) data.

Areas such as automatic target screening and visual image enhancement for target
detection/classification are applicable to the accomplishment of the target detection
function. Emphasis in these areas is on automatic processing at the sensor input data
rates. An operator performing the target detection function is not capable of
interactively manipulating complex systems to optimize the detection of targets because
of the extremely high data rates for raw sensor data. On the other hand, implementation
of image processing techniques to assist in performing the target identification function
is not subject to this same constraint. The input data rates for the operator would be
substantially less and the operator would have time to interact with the system to
optimize the visual display of a target.

The primary purpose of the DHRS is to provide design data for a digital system
that will provide real-time targeting information and to provide a technical base for
the design of an airborne display for use in future recce/strike aircraft. The DHRS
will also be used as an evaluation facility for advanced real-time sensor systems and
to test and evaluate false alarm rates, missed target rates, and other issues related
to the automatic processing of NRT imagery.

The system proposed will reduce the time now required between intelligence gathering
and resultant decision making. The major technical/human interface will be the
integration of the components which will reformat and input the real-time data and perform
target detection and image enhancement to aid in the human interpretation of the imagery.
The workload of the human interpreter will be reduced to a manageable task of identifying
targets which have been detected via the DHRS hardware/software modules.

Future  research  and  development  efforts  will  emphasize  the  ability  to  demonstrate 
actual  improvements  in  target  detection  and  identification.  As  an  example  of  this 
approach,  RADC  (Rome  Air  Development  Center)  is  currently  implementing  several  programs 
to  test  the  applicability  of  image  processing  techniques  to  the  real  time  environment. 

In one program, sequences of video frames are being digitized and processed by
real-time algorithms. The sequences of video frames are then evaluated using human
interpreters to determine their applicability to the real time environment. Only those
image processing techniques yielding concrete improvements in detection or identification
will be considered as candidates for future implementation.

The third area of NRT imagery exploitation is target location. An NRT precise
target location subsystem is essential to support NRT tactical strike concepts. The
current approach to the development of precise target coordinates employs maps and
charts or point positioning data bases. Coordinates developed from maps and charts
lack the precision required to support all weather/night strike systems. The point
positioning data base (PPDB) system is currently being used to develop precise coordinates.

None of these systems, however, is responsive to a near-real-time environment or
has the growth potential to become so. Storage and retrieval of the PPDB is a
manual, time consuming process. Automatic methods for handling high quantities are
not available and would be extremely expensive to develop and maintain in operational
use. Point transfer is a manual or computer-assisted process and currently is both
time consuming and prone to error. The time required for a point transfer alone in
some areas can be 5-15 minutes for imagery selection and system set up and 5-10
minutes for locating a single point. To achieve the NRT goals, the location function
must be accomplished in seconds or minutes and must not impact or delay the detection,
identification or reporting processes.

The technical approach for implementing an advanced NRT precise target location
subsystem is directed at achieving the program objective of deriving, in near real
time, the precise location information of potential targets with enough accuracy to
support (1) identification, (2) the decision to strike or attack, and (3) the execution
function, including the high accuracy required for night and adverse weather tactical
weapon delivery systems. The approach allows for the development of precise locations
of any or all targets from multisensor imagery.

The basic location subsystem will be configured around state-of-the-art technology;
however, a modular approach to implementing the target location subsystem will permit
the flexibility to incorporate new techniques and technology as they are developed.

The  advanced  target  location  subsystem  development  provides  the  capability  to 
derive  precise  target  location  information  in  near-real-time.  An  all  digital  approach 
has  been  selected  that  will  provide  the  location  information  on-line  or  integrated  with 
the  functions  of  detection  and  identification  with  the  coordinates  automatically 
incorporated  into  the  report.  The  photo  interpreter  will  denote  a  detected  target  on 
the  display  subsystem  with  a  cursor  or  light  pen  and  command  the  system  to  determine 
precise  location.  The  target  location  subsystem  would  then  accomplish  the  necessary 
actions  to  derive  the  locations  in  whatever  coordinate  reference  system  is  desired,  and 
provide  the  coordinate  information  back  to  the  operator  for  inclusion  into  the  report. 

Target location will be accomplished by utilizing proven photogrammetric techniques
and some yet-to-be-developed technology. Two major methods are feasible, depending on
the inputs and the accuracy of those inputs available from reconnaissance imaging
sensors. If the reconnaissance sensor data include very accurate vehicle position and
attitude information and the math model of the sensor is known to a high degree of
accuracy, a direct target location technique can be developed to generate accurate
target coordinates very rapidly. However, if the sensor vehicle position, attitude,
and math model information is not available or is not highly precise, then a technique
that relies on a point positioning data base (PPDB) must be utilized.

Both  techniques  (PPDB  dependent  and  PPDB  independent)  are  being  developed  because 
it  is  highly  probable  that  the  precise  vehicle  position  and  attitude  information  for 
all  of  the  sensors  will  not  be  available  for  some  time.  Technical  descriptions  for 
each  of  these  techniques  follow: 

The point positioning data base dependent technique comprises three subsystems:
a digital point positioning data base, a PPDB storage and retrieval device,
and a digital target location module.

Exploratory development efforts underway at RADC are currently addressing the specific
required characteristics and will develop an experimental PPDB for subsequent use in
NRT target location demonstrations. The point positioning data base will consist of
digital image data, parametric data, digital terrain elevation data (MATRIX type
elevation data) and possibly digital cartographic data.

The PPDB storage and retrieval device is a subsystem for storing massive amounts of
digital image data. A typical point positioning data base will be expected to cover
approximately 100,000 square miles, which necessitates a very large storage capability.

In order to meet the throughput response objectives of the program, this digital image
data must be randomly accessible in a matter of a few seconds. The data transfer rates
must also be very high. A device will be designed, fabricated and interfaced to the
target location subsystem processor that will provide the necessary storage and retrieval
of the PPDB. Optical disk technology appears very promising for this application.

The  digital  target  location  module  will  be  composed  of  several  sub-modules  that  provide 
digital  image  viewing/mensuration,  a  point  transfer  capability,  processor(s)  and  software. 
The  viewer/mensuration  operator's  station  device  may  be  the  display  subsystem  used  to 
support  NRT  target  detection/identification  or  may  (of  necessity)  be  another  viewing 
device.  The  viewer  must  provide  a  display  of  reconnaissance  imagery  to  the  operator 
for  determination  of  the  pixel  location  of  the  target  within  the  overall  image  display. 

The pixel location information would be processed using the sensor geometry, position
and attitude information to determine approximate ground coordinates. These coordinates
would in turn be used to retrieve the appropriate PPDB image from the storage and
retrieval device. The PPDB would be displayed and the operator would then perform a
manual or a computer-assisted point transfer. The ultimate objective is to develop an
automatic point transfer capability. This capability would permit point transfer of
the target on the recce image into the PPDB to be accomplished much faster and
completely automatically. With the point transfer accomplished, the processor would then
perform the photogrammetric calculations to derive the ground coordinates. The final
derived coordinates (x, y, z) are then provided to the operator and to the
report generation subsystem.

The PPDB independent target location subsystem, also known as direct target location,
addresses the capability to derive target location information without the use of a
PPDB. Direct target location can be accomplished when the attitude, position, sensor
geometry, and image plane coordinates are known accurately enough to define a ray in
space and intersect it with the earth's surface. The PPDB independent capability would
be primarily a software module that could be incorporated into the system. Only the
designation of the target on the display screen by the operator would be necessary to
generate accurate coordinates. Direct target location is complementary to PPDB targeting
and covers the contingency when PPDB's are not available.
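
The geometric core of direct target location, intersecting a ray with the earth's
surface, can be sketched under strong simplifying assumptions. The sketch below treats
the ground as a horizontal plane at a known elevation in a local level frame with z up,
and takes the ray direction as already computed from attitude, sensor geometry and
image plane coordinates. The function name and the flat-earth model are assumptions; an
operational system would need a proper earth model and terrain elevation data.

```python
def intersect_ground(position, direction, ground_elevation=0.0):
    """Intersect a ray with a horizontal plane at ground_elevation.

    position and direction are (x, y, z) tuples in a local level frame, z up.
    Returns the (x, y, z) intersection, or None if the ray never reaches the plane.
    """
    px, py, pz = position
    dx, dy, dz = direction
    if dz == 0:
        return None                      # ray parallel to the ground plane
    t = (ground_elevation - pz) / dz     # distance along the ray, in units of direction
    if t <= 0:
        return None                      # plane lies behind the sensor
    return (px + t * dx, py + t * dy, ground_elevation)

# Example: sensor 1000 m above the plane, looking forward and down at 45 degrees.
target = intersect_ground((0.0, 0.0, 1000.0), (1.0, 0.0, -1.0))
```

In this simplified geometry, an error in the look direction shifts the computed ground
point by an amount that grows with slant range, which illustrates why the text
requires attitude and position to be known accurately.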

SUMMARY  AND  CONCLUSIONS: 

The advent of digital technology to support NRT image exploitation offers the
opportunity to develop and field a "universal" digital image exploitation system. The
"universal" system would employ a "standard" sensor interface such that all NRT digital
image sensor systems would be supported on the ground by a common image exploitation
system. This common system could operate relatively independent of sensor type, source
or even nationality. Software modules would be used to configure the common system for
the sensor to be exploited. Future image sensors could also be accommodated in this
way.


The common image exploitation system offers other advantages. As automated digital
image pattern recognition technology is developed and tested, the common system could
be modified in a modular fashion to take advantage of these advanced techniques.

Finally, and of primary significance to NATO, the common system concept is ideally
suited to supporting NATO forces in that the ground exploitation station can be made
relatively independent of sensor collector type or nation of origin. The flexibility
of the common system concept also provides opportunities for NATO cooperative/joint
research and development programs dealing with high technology needs to support NRT
image exploitation. The following are potential technical areas where this
cooperative/joint R&D could be mutually beneficial:

.  High  Resolution  Displays 

.  Automatic  Target  Detection 

.  Automatic  Target  Identification 

.  Multi-Sensor  Correlation 

.  Automatic  Change  Detection  Techniques 

.  Standard NATO Sensor/Data Link/Ground Station Interface Definition

.  Standard/Compatible NATO NRT Report Format Definition

This list is not intended to be all inclusive. It is suggested as a catalyst to
promote advocacy of a joint NATO program which would ultimately result in a NATO NRT
imagery exploitation system. Comments relating to mechanisms for initiating such a
program and pros and cons for a common NATO image exploitation program are encouraged.

SUMMARY


An important role for target acquisition is the location and identification of
targets in near real time. Current technology has been compartmented into sensors,
processing, air or ground exploitation and finally dissemination. In the days of hot
spot or radar blip detection, this segmentation of functions was appropriate. With the
current emphasis on real time decision making from outputs of high resolution sensors,
this thinking has to be re-analyzed. A total systems approach to data management must
be employed using the constraints imposed by the atmosphere, survivable flight profiles,
and the human workload. This paper will analyze the target acquisition through
classification tasks and discuss the machine processing and data screening techniques that
are applicable. The data handling capabilities of an on-board operator and ground based
image interpreter are compared. A philosophy of processing data to get information as
early as possible in the data handling chain is examined in the context of ground
exploitation and dissemination needs. Examples of how the various real time sensors
(screeners and processors) could fit into this data handling scenario are discussed.
Specific DoD programs will be used to illustrate the credibility of this integrated
approach.


THE  NEED  FOR  RAPID  TARGET  ACQUISITION 

Since the Blitzkrieg type of warfare was successfully demonstrated in World War II,
the mobility of ground forces has constantly increased. It is now possible for very
large, well equipped, full spectrum ground forces to overrun defenses and thrust from
tens of kilometers to a hundred or more kilometers per day, obtaining and maintaining
control of the entire area encompassed, or to rapidly deploy in preparation for an
attack. Suffice it to say the battlefield, as well as the domain of air operations, can
be extremely fluid.

It  is  the  prime  purpose  of  tactical  reconnaissance  to  provide  the  theater  and  the 
field  commanders  with  the  necessary  information  about  the  enemy's  operational  situation 
to  successfully  conduct  operations,  both  air  and  ground,  in  such  a  fluid  environment. 
Obviously  this  information  must  be  timely. 

When the enemy has the prerogative and the opportunity to conduct offensive
operations, the need for timely information becomes increasingly critical. The defense
must effectively react while the opportunity is available or before it is too late and
its capability to do so is overrun or otherwise negated. In this type of highly mobile
warfare tactical aerial reconnaissance plays a vital role, being the most mobile of all
capabilities. A reconnaissance aircraft can travel at a rate of 18 kilometers per
minute or more, permitting it to reach an area of concern from a remote base in a few
minutes. The crux of the problem is to make the information, so quickly gatherable,
useful in time for reaction. To date this has been effectively accomplished too
infrequently.

Traditionally, airborne reconnaissance has been accomplished by a reconnaissance
vehicle flying over the area of interest. A variety of sensors aboard the vehicle
sense and record the reconnaissance information for subsequent processing, interpretation
and reporting of the digested information. These functions normally occur upon landing
at the reconnaissance base. The time involved in this total process, from sensing to
reporting, varies from hours to weeks, depending upon the type of mission, the
quantity of data sensed (the extent of the coverage and the resolution), and the type of
information to be extracted. This time factor can be broken down into functions as
follows:

Air  Transportation  -  Time  for  the  vehicle  to  return  to  base  and  land.  (Typically  10 
min.  to  1  hour) 

Ground  Transportation  -  Time  to  transfer  the  recordings  from  the  vehicle  to  a  ground 
processing  station.  (5  to  20  min.) 

Data  (or  Image)  Reduction  -  Time  to  process  the  recordings  into  extractable  form. 
(10  min.  to  2  hours) 

Target  Detection  and  Location  -  Time  to  find  probable  targets/areas  of  interest  in 
the  total  recorded  take.  (2  min.  to  hours) 

Intelligence Interpretation - Time to interpret the recordings into applicable
information. (5 min. to days, or weeks in the case of maps). This usually involves
correlation with other information.

Reporting - Time for reporting procedures (Preparation, Dissemination). (A few min.
to hours, days for maps)
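
As a rough illustration of how these stage times accumulate, the sketch below sums the lower bounds quoted in the list above for a strictly sequential cycle. The 3-minute value for reporting stands in for "a few min." and is an assumption, as are all the names used.

```python
# Lower bounds (minutes) for each stage of the traditional cycle, taken from
# the list above. Reporting is quoted only as "a few min."; 3 is an assumption.
STAGE_MIN_MINUTES = {
    "air_transportation": 10,
    "ground_transportation": 5,
    "data_reduction": 10,
    "target_detection_and_location": 2,
    "intelligence_interpretation": 5,
    "reporting": 3,  # assumed stand-in for "a few min."
}


def best_case_cycle_minutes(stages=STAGE_MIN_MINUTES):
    """Sum the best-case stage times, assuming the stages run one after another."""
    return sum(stages.values())


print(best_case_cycle_minutes())  # 35
```

Even with every stage at its quickest, the sequential cycle in this sketch takes roughly half an hour.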

The  above  time  factors  are  typical  or  average  values.  They  may  be  significantly 
less  for  some  reconnaissance  missions  where  the  number  of  specific  targets  to  be 
reconnoitered  is  small  and  urgency  procedures  can  be  applied  throughout  the  cycle.  For 
missions  in  which  the  information  perishability  factor  is  an  hour  or  more,  traditional 
capabilities  are  adequate  to  solve  the  problem.  They  will  continue  to  provide  the  bulk 
of  the  reconnaissance  and  intelligence  information  obtained.  However,  for  reconnaissance 
of  more  perishable,  or  more  urgent  targets  the  normal  reconnaissance-intelligence  cycle 
must  be  drastically  compressed  or  significantly  changed  into  a  real  or  near  real-time 
reconnaissance  capability  in  order  to  be  responsive  to  a  highly  mobile  threat. 

The  functions  of  target  detection,  location  and  interpretation  normally  involve  the 
human  eye  and  brain  processes  which  are  not  amenable  to  significant  time  compression. 
Consequently, for near real-time operation, those reconnaissance jobs requiring substantial
human study must not be undertaken or the process will rapidly outgrow its
required  time  frame.  The  amount  of  information  for  which  the  human  is  tasked  should  be 
no  more  extensive  than  that  absolutely  required  for  the  primary  purpose  of  the  mission. 
Only  answers  to  simple  and  precise  reconnaissance  questions  can  be  expected  from  a  human 
observer  in  real  or  near  real-time.  It  follows  that  real  and  near  real-time  methods 
should  only  be  applied  to  obtain  the  minimum  information  essential  to  make  a  decision 
where  rapid  response  is  likely  to  be  required. 

The historical method of gathering data is with a film camera. A film/camera data
rate of 10^ is not uncommon. For the real-time case, a sensor data rate of 10^6 to 10^7
pixels per second is experienced in typical tactical scenarios. This pixel rate
is arrived at by combining the ground resolution, swath width and V/H (velocity-to-height
ratio) necessary to adequately perform the tactical mission. The human in the
interpretation task has a 10^ data processing rate (Gardiner and Nicholson, 1971).
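
The way ground resolution, swath width and V/H combine into a pixel rate can be sketched as the product of pixels per scan line and scan lines per second. This is a minimal illustration, not the cited authors' calculation; the function name and the numerical values are hypothetical.

```python
def pixel_rate(ground_resolution_m, swath_width_m, v_over_h, altitude_m):
    """Pixels per second needed to cover a swath at a given ground resolution.

    v_over_h is the velocity-to-height ratio in rad/s, so the ground speed
    is v_over_h * altitude_m. Square pixels are assumed.
    """
    ground_speed_m_s = v_over_h * altitude_m
    pixels_per_line = swath_width_m / ground_resolution_m
    lines_per_second = ground_speed_m_s / ground_resolution_m
    return pixels_per_line * lines_per_second


# Hypothetical case: 0.5 m resolution, 1 km swath, 250 m/s at 500 m altitude.
rate = pixel_rate(ground_resolution_m=0.5, swath_width_m=1000.0,
                  v_over_h=0.5, altitude_m=500.0)
print(rate)  # 1000000.0
```

Because the resolution enters twice, halving the ground resolution cell quadruples the pixel rate for the same swath and speed.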

Because  of  the  limited  amount  of  information  which  a  human  can  handle  in  real  or 
near  real  time,  and  the  tremendous  amount  of  data  which  a  sensor  can  acquire  in  real 
time,  a  human  observer  in  the  reconnaissance  aircraft  may  well  miss  or  overlook  very 
important  information  which  may  not  be  quite  as  time  sensitive  as  that  of  initial  prime 
concern.  For  this  reason,  as  well  as  for  verifying  the  airborne  interpretations,  it  is 
important  that  permanent  recordings  be  made  of  all  sensed  information  for  more  thorough 
assessment  after  return  to  base,  even  though  the  intent  of  the  mission  is  one  of  obtaining 
a  specific  real  or  near  real  time  reconnaissance  answer. 

OPERATIONAL ASPECTS OF REAL-TIME RECONNAISSANCE


The traditional reconnaissance information process cycle was broken into steps and
defined. They are repeated here for convenience:

1. Air Transport

2. Ground Transport

3. Data Reduction

4. Target Detection and Location

5. Intelligence Interpretation

6. Reporting

The  use  of  a  wide  band  data  link  makes  possible  the  elimination  of  several  of  the 
above  reconnaissance  cycle  steps  and  the  possible  reduction  of  the  time  factors  of 
others.  For  example,  if  the  data  link  can  operate  directly  from  the  reconnaissance 
vehicle,  while  it  is  acquiring  the  information,  and  can  transmit  at  the  acquisition  rate, 
then  steps  1  and  2  above  are  eliminated.  In  addition,  step  3  can  be  reduced  by  operating 
the  recorder-processor  in  tandem  with  the  receiver,  at  the  information  acquisition  rate. 
Rapid film processing technology can reduce step 3 to under 2 minutes and, under special
conditions, to as low as 10-15 seconds. Even if the reconnaissance aircraft cannot
transmit  while  acquiring,  a  buffer  storage  can  be  provided  in  the  aircraft  to  permit 
transmission  when  the  aircraft  reaches  a  position  from  which  it  can  transmit.  This  will 
not  eliminate  step  1  but  will  reduce  it  considerably;  step  2  will  still  be  eliminated; 
and  step  3  will  be  reduced  as  above.  Step  3  can  be  reduced  to  zero  by  use  of  a  dynamic 
display  of  the  information  instead  of,  or  in  parallel  to,  the  recorder. 
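
The effect of the data link on steps 1 to 3 can be sketched as follows. The baseline minutes reuse the lower bounds quoted earlier, with 3 minutes assumed for reporting; the function and parameter names are hypothetical, and the buffered case leaves step 1 unchanged because the text gives no figure for its reduction.

```python
def compressed_cycle(stage_minutes, direct_link=True, dynamic_display=False):
    """Apply the data-link reductions described above to a {step: minutes} map."""
    out = dict(stage_minutes)
    out[2] = 0.0  # no ground transport of recordings in either link case
    if direct_link:
        out[1] = 0.0  # aircraft transmits while acquiring
    # Otherwise (buffer storage) step 1 is reduced by an unspecified amount;
    # it is left unchanged here rather than guessed.
    # Step 3: zero with a dynamic display, else capped at the 2-minute figure.
    out[3] = 0.0 if dynamic_display else min(out[3], 2.0)
    return out


# Lower bounds from the earlier list; step 6 (reporting) is an assumed 3 min.
baseline = {1: 10.0, 2: 5.0, 3: 10.0, 4: 2.0, 5: 5.0, 6: 3.0}
linked = compressed_cycle(baseline)
print(sum(baseline.values()), sum(linked.values()))  # 35.0 12.0
```

In this sketch the remaining time is dominated by steps 4 to 6, the steps in which the human is involved.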

For steps 4, 5, and 6, the human is normally involved and the time required
for these steps can only be reduced by changing his duties. The time required is a
function  of  the  area  covered  by  the  reconnaissance  per  unit  time,  the  number,  size  and 
deployment  of  targets  of  concern,  other  target  characteristics  such  as  smoke,  dust,  or 
tracks,  the  ability  to  extract  the  pertinent  information  from  the  scene  (clutter,  contrast), 
the  number  of  interpreters  employed,  the  skill  and  experience  of  the  interpreters,  and 
the  procedures  used.  The  target/background  characteristics  (such  as  clutter  and  contrast) 
are  always  dominant  factors. 

Real  and  near  real-time  transmission  of  reconnaissance  imagery  can  thus  reduce  the 
total  time  factor  between  acquisition  of  the  reconnaissance  and  reporting  to  a  matter  of 
approximately  10  minutes  up  to  an  hour  or  so.  Wide  band  data  transmission  is  thus  most 
useful  for  information  whose  perishability  factor  is  in  this  time  range.  However,  this 
conclusion  is  valid  only  when  the  human's  time  factors  do  not  dominate  the  equation. 
Otherwise,  the  process  becomes  swamped,  quickly  breaks  down,  and  is  useless. 


For operation in a high jamming environment, or in line of sight conditions where a wide
band data link may be unusable, the same results can be achieved by providing a
dynamic display for the reconnaissance crew in the reconnaissance vehicle and having them
perform the entire process and report via a narrow band link. The reconnaissance cycle
time is thus reduced to the time required for the reconnaissance crew to detect and
recognize (interpret) the targets and verbally or symbolically report their findings.
Within certain constraints this can be a matter of seconds.

If  the  reconnaissance  crew  is  delegated  the  authority  to  make  strike  decisions,  it 
is  now  a  reconnaissance/strike  capability.  This  reconnaissance/strike  process  has  been 
accomplished  for  many  years  for  a  limited  range  of  applications  and  under  restrictive 
conditions. The traditional forward air controller, operating in low-performance aircraft,
is an example of the entire process which usually includes decision making and
direct  or  indirect  control  (target  designation)  of  strikes.  Until  recently  the  principal 
reconnaissance  sensor  of  the  forward  air  controller  has  been  the  human  eye,  unaided  or 
aided  with  various  types  of  visual  optical  sights.  In  addition,  a  substantial  amount  of 
reconnaissance  information  has  been  collected  visually  by  strike  pilots  as  an  adjunct, 
or  in  addition,  to  their  primary  strike  missions.  In  these  cases  the  process  has  been 
in  real  or  near  real  time  although  no  reconnaissance  sensors  nor  displays  of  pictorial 
reconnaissance  information,  per  se,  were  necessary.  However,  the  conditions  of  operation 
have  been  restricted  to  daylight  and  good  visibility. 

Sensor  technology  now  permits  the  target  acquisition  capability  to  be  extended  to 
nighttime  operation  and/or  conditions  of  visibility  well  beyond  the  eye's  unaided 
capability.  In  addition,  depending  on  the  scenario,  the  near  real  time  target  acquisition 
capabilities  may  be  accomplished  from  high  performance  aircraft.  In  these  cases  the 
target  rate,  clutter  and  contrast  become  more  and  more  important.  While  much  work  is 
progressing  to  model  and  quantify  the  effect  of  the  physical  variables  on  the  human 
performance  of  detecting  and  recognizing  targets  at  real  time  rates,  the  variables  are 
so  complex,  numerous  and  interrelated  that  the  problem  cannot  be  generalized.  The 
remainder  of  this  paper  will  discuss  techniques  to  assist  in  the  real/near  real  time 
performance  of  sensors/humans  to  perform  target  acquisition  missions  and  the  operational 
limitations  of  such  techniques. 

One  of  the  prime  functions  of  aerial  reconnaissance  is  to  maintain  periodic 
surveillance  of  a  large  area  of  concern  to  alert  the  commander  to  enemy  activity  or 
movements.  As  these  missions  must,  of  necessity,  be  flown  at  high  altitudes  in  order  to 
survey wide areas, line of sight to the receiving station (ground or airborne) can be
achieved. Use of directive and tracking antennae reduces the susceptibility to jammers
not  directly  or  nearly  in  the  line  of  sight.  Only  an  airborne  jammer  nearly  in  the  line 
of  sight  can  effectively  jam  the  receiver  and  thus  the  transmission. 
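
Why high altitude makes line of sight achievable can be illustrated with the usual smooth-Earth radio horizon estimate. This is only a sketch: it assumes a 4/3 effective Earth radius, ignores terrain masking entirely, and uses hypothetical altitudes.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius
K_FACTOR = 4.0 / 3.0          # assumed standard-atmosphere refraction factor


def horizon_km(height_m):
    """Distance to the radio horizon over a smooth Earth, in kilometers."""
    return math.sqrt(2.0 * K_FACTOR * EARTH_RADIUS_M * height_m) / 1000.0


def max_link_range_km(transmitter_height_m, receiver_height_m):
    """Greatest separation at which the two ends can still see each other."""
    return horizon_km(transmitter_height_m) + horizon_km(receiver_height_m)


high = max_link_range_km(10_000.0, 0.0)  # hypothetical high altitude pass
low = max_link_range_km(100.0, 0.0)      # hypothetical low altitude pass
print(round(high), round(low))  # 412 41
```

The range grows only with the square root of altitude, so a hundredfold drop in height costs a factor of ten in line-of-sight range even before terrain is considered.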

For  the  surveillance  mission,  where  activity  is  the  prime  factor  of  interest,  targets 
need  not  be  recognized  as  such.  Just  changes  or  movements  or  other  unique  features  of 
detected  objects  need  be  observed.  Of  course,  the  objects  detected  must  have  a  high 
probability of being of military significance, or this relationship must somehow be
deducible. For this purpose cues can be used. The most readily available cues are:

1.  Motion  (MTI  Radar) 

2.  Change  detection  (Radar) 

3.  Electromagnetic  Emissions 

4.  Heat  Emissions  (Infrared  Sensor,  weather  permitting) 

Obviously  the  degree  of  normal,  non-military,  activity  of  the  surveyed  area  must  be 
relatively  low.  A  convoy  of  military  vehicles  on  a  busy  highway  will  be  detected  but 
only  supplemental  information  will  differentiate  them  from  normal  civilian  traffic.  On  a 
little  used  secondary  road,  however,  the  appearance  of  many  moving  objects  would  be  cause 
for  alert. 

In  any  case  the  decision  of  where  and  how  to  react  will  probably  require  a  specific 
reconnaissance  mission  to  determine,  through  recognition,  what  the  motion  or  activity 
really  means.  The  most  useful  purpose,  then,  of  a  surveillance  mission  is  to  highlight 
points  or  small  areas  for  more  detailed  coverage,  as  well  as  to  provide  alert  warning.  As 
this mission is accomplished from high altitude and covers large areas, radar, either MTI
or side-looking synthetic aperture, or more desirably a combination of both, can provide
an all weather surveillance capability of wide areas for activity indication and alert
purposes. The information, either raw or partially processed, may be linked to the
ground via wide band data link and automatically processed into imagery for manual
readout. If provided in digital form, the information may be automatically compared to
information from a previous pass of the same area in a digital change detector with all
changes superimposed on the radar map. Such a radar surveillance capability may be
augmented with electronic reconnaissance and/or infrared for the detection of targets with
those specific characteristics. For an analog system the human is required to look at
the imagery and, depending on the situation, could soon take so much time as to negate the
advantages of real-time transmission. Automatic digital change detection can overcome this
limitation and provide the necessary movement information within the allowable time.
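
The idea of a digital change detector can be sketched as a cell-by-cell comparison of two registered passes, with the changed cells marked on the current map. This is a simplified illustration under the assumption that both passes are already registered to the same grid; the threshold and data are hypothetical, and it is not a description of any fielded system.

```python
def detect_changes(previous, current, threshold):
    """Return (row, col) cells whose intensity changed by more than threshold."""
    if len(previous) != len(current) or any(
        len(p) != len(c) for p, c in zip(previous, current)
    ):
        raise ValueError("both passes must be registered to the same grid")
    return [
        (r, c)
        for r, (prow, crow) in enumerate(zip(previous, current))
        for c, (p, q) in enumerate(zip(prow, crow))
        if abs(q - p) > threshold
    ]


def overlay(current, changes, marker="*"):
    """Superimpose change markers on a coarse text rendering of the map."""
    marked = set(changes)
    return [
        "".join(marker if (r, c) in marked else "." for c in range(len(row)))
        for r, row in enumerate(current)
    ]


first_pass = [[10, 10, 10], [10, 10, 10]]
second_pass = [[10, 40, 10], [10, 10, 12]]
changes = detect_changes(first_pass, second_pass, threshold=20)
print(changes)                        # [(0, 1)]
print(overlay(second_pass, changes))  # ['.*.', '...']
```

The interpreter then has only the marked cells to examine rather than the whole map.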


For  detail  examination  of  specific  points,  small  areas  or  routes  which  have  been 
highlighted  by  surveillance  missions,  or  other  sources,  as  requiring  closer  scrutiny, 
specific  reconnaissance  sorties  are  needed  to  provide  the  necessary  information.  Today's 
state-of-the-art in automatic target identification is not sufficiently advanced to
preclude the need for human recognition of targets on which to base strike decisions or
tactical employment commitment. This is not to say that exigencies of the situation may
not require action based on such a dearth of confirmatory information. The operational
situation may be critical enough to "blast every blip in sight" but in the absence of
the need for such desperate measures it seems fair to assume that commanders will require
more positive identification, based upon recognition or at least reliable classification,
before major commitment of resources. Automatic classification techniques, discussed
later, are under development, but sensors providing sufficient information to the
processor, plus processor accuracy, completeness and low enough false alarm rates for
operational reliance have yet to be flight demonstrated and proven.

As  humans  must,  as  yet,  do  the  recognizing  of  targets  this  must  predominantly  be 
done  by  virtue  of  the  shape  and  size  of  the  target.  Today's  radars  or  microwave  sensors 
do  not  provide  sufficient  resolution  for  recognition  of  most  tactical  targets  by  their 
shape. Their resolution is improving (with complications) but many problems,
such as specularity, susceptibility to electronic countermeasures, etc., remain to be solved.

Thus, the primary recognition sensors (other than direct eyeball) must now be
electro-optical or infrared imagers which present visual-like information to the human.
Operating in the optical wavelengths (less than 300 micrometers), these sensors are susceptible to
atmospheric  degradations.  They  cannot  operate  through  clouds  so,  in  the  presence  of 
cloud  cover,  they  must  be  used  below  the  ceiling.  Also,  due  to  atmospheric  haze,  sensor 
performance  improves  as  the  distance  to  the  target  is  decreased.  Sometimes  (in  heavy 
fog)  they  are  totally  ineffective. 

The atmospheric factors, plus the current trend of flying very low to survive in
today's  high  threat  environment,  dictate  that  this  type  of  reconnaissance  mission  be 
performed  at  low  altitude. 

Low  altitude  operation  over  hostile  territory  poses  two  very  severe  problems  for  wide 
band  data  links.  Such  links  require  line  of  sight  from  the  transmitting  aircraft  to  the 
receiver.  If  the  transmitting  aircraft  is  at  low  altitude,  terrain  masking  prohibits  the 
maintenance  of  line  of  sight  to  a  ground  receiver.  The  receiver,  either  as  a  relay  or  as 
the interpretation station, must be at high altitude. Such a receiver is highly susceptible
to jamming by very unsophisticated means. Its overall operating cost is high and
it  is  a  lucrative  target  for  direct  attack.  Even  if  the  receiving  station  employs  highly 
directional  antennae  tracking  the  transmitting  reconnaissance  aircraft,  if  the  jammer  is 
anywhere  in  the  vicinity  of  the  reconnoitered  target  area,  then  the  jamming  can  only  be 
overcome  by  one  of  the  following: 

a. Recording the reconnaissance data for transmission when the reconnaissance aircraft
proceeds away from the jammers (if it climbs to high altitude then it can transmit
directly to the ground, as does the surveillance mission, and a relay is not required but
the highly directional tracking receiving antenna is).

b.  Transmitting  a  moderate  information  band  width,  over  a  very  high  transmission 
band  width  digital  data  link  while  using  sophisticated  digital  processing  techniques  to 
overcome  the  jamming.  This  restricts  the  transmitted  information  band  width  to,  at  best, 
several  hundred  kilobits  per  second  (a  high  altitude  receiver  with  directional  tracking 
antennae is still required).

c.  Observing  the  target  areas  and  interpretation  of  the  target  situation  by  the 
reconnaissance  crew  with  direct  or  subsequent  reporting  over  a  narrow  band  width,  jam 
resistant  (possibly  secure),  digital  data  link  by  voice,  symbology,  or  very  limited 
pictorial  coverage. 
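
The trade in option b, a moderate information rate carried over a much wider transmission band width, can be illustrated with the standard spread-spectrum processing gain, the ratio of the two expressed in decibels. Whether this is the specific technique the authors had in mind is an assumption; the 200 MHz transmission band width is hypothetical, and the 200 kilobits per second figure is chosen to fall within the "several hundred kilobits per second" quoted above.

```python
import math


def processing_gain_db(transmission_bandwidth_hz, information_rate_bps):
    """Processing gain of a spread link: 10*log10(band width / information rate)."""
    if transmission_bandwidth_hz <= 0 or information_rate_bps <= 0:
        raise ValueError("band width and information rate must be positive")
    return 10.0 * math.log10(transmission_bandwidth_hz / information_rate_bps)


# Hypothetical link: 200 MHz transmission band width, 200 kbit/s of information.
gain = processing_gain_db(200e6, 200e3)
print(round(gain, 1))  # 30.0
```

Each tenfold reduction in the information rate buys a further 10 dB against the jammer, which is why the transmitted information band width must be so restricted.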

REAL  TIME  PROCESSING  FOR  TARGET  ACQUISITION  AND  CLASSIFICATION 

Cueing for the wide area surveillance mission has already been discussed. Use of
autonomous cueing, screening or target classification techniques on specific reconnaissance
missions would enhance the search capability considerably to accomplish reacquisition
of targets which have moved, or to overcome positional errors in either the
original target location or in the navigation. They may even provide some capability for
finding targets of opportunity. If the cueing technique is automatic, the reconnaissance
crew or the ground interpreters are assisted in the search function (a function which the
human does very poorly but machines can do well) and are free to concentrate on the
recognition and identification functions (which machines do poorly but humans do well).

Several  cueing  techniques  will  be  discussed  below,  along  with  their  applications  in 
conjunction  with  an  identification  sensor  to  provide  a  localized  search/identification 
capability. These cueing techniques are highly sensor, atmosphere and target dependent.
Specific unique target characteristics are used and false alarm (non-targets with similar
unique characteristics) rates are constantly a problem. While clutter without target-like
characteristics is no problem, in a highly target rich environment cuers can become
saturated. In this case, selection of a particular cued target for identification must be
either manual or random.

CUEING USING THE IDENTIFICATION SENSOR


Infrared  cues.  Under  many  conditions  certain  types  of  targets  emit  strong  infrared 
radiation  by  virtue  of  their  temperature/emissivity  characteristics.  These  emissions  may 
be  detected  at  long  range  by  a  FLIR  operating  in  the  "wide"  angle  mode  near  the  horizon, 
although target masking by terrain or foliage may be a problem. The sensor may be "panned"
slowly across the forward horizon for the detection of such emissions. Upon detection the
narrow angle of the FLIR may be employed, centered on the "hot spot", and closure continued
until recognition. In the case of a downward looking infrared sensor, hot spots on the
imagery automatically draw attention to specific points and their immediate surrounds.
While hot spot cueing is inherent in an infrared sensor it is not effective if a
proliferation of hot spots appears.

Automatic  Screening.  Considerable  work,  with  some  success,  has  been  accomplished  on 
applying  automatic  screening  techniques  to  analog  video  signals,  especially  from  infrared 
sensors (both line scan and FLIR). Simple algorithms have been developed to detect targets,
with the first iteration being on edges and lines characteristic of man-made objects
and further refinement being made on size and geometry. The screener indicates which
areas do or do not contain man-made objects. The further refinements discard such man-made
objects as roads and large buildings. Detection probabilities on vehicles have been
high,  but  false  alarm  rates  are  highly  dependent  on  the  background  due  to  the  fact  that 
features  of  vehicles  (size,  aspect  ratio,  etc.)  used  in  the  screening  process  are  the  same 
as  those  of  some  non-vehicle  man-made  objects  of  no  interest.  The  principal  virtue  of 
this  technique  is  that,  in  real  time,  it  automatically  discards  areas  of  a  scene  wherein 
the  probability  of  there  being  a  man-made  object  is  low,  thus  improving  search  time  by  a 
human.  However,  in  a  highly  cluttered  urban  scene  very  little  area  is  discarded,  limiting 
the  usefulness  of  this  technique  to  rural  or  natural  backgrounds.  The  screener  output  may 
be  superimposed  symbolically  on  the  observer's  display  and/or  record  either  in  the  cockpit 
from  a  FLIR  or  line  scan  sensor,  or  at  the  receiving  station,  to  highlight  those  small 
areas  where  the  probability  of  there  being  a  target  is  high. 
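
The refinement stage of such a screener, which discards man-made objects of the wrong size or geometry, can be sketched as a simple filter on length and aspect ratio. The thresholds and object dimensions below are hypothetical and chosen only to illustrate the behavior described above, including the false alarm from a vehicle-sized non-vehicle.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A man-made object flagged by the first (edge and line) stage."""
    label: str
    length_m: float
    width_m: float


def is_vehicle_like(c, min_len=3.0, max_len=12.0, max_aspect=4.0):
    """Keep objects whose size and aspect ratio are plausible for a vehicle."""
    long_side = max(c.length_m, c.width_m)
    short_side = min(c.length_m, c.width_m)
    if short_side <= 0:
        return False
    return min_len <= long_side <= max_len and long_side / short_side <= max_aspect


def screen(candidates):
    """Return only the candidates that survive the size and geometry test."""
    return [c for c in candidates if is_vehicle_like(c)]


scene = [
    Candidate("road segment", 200.0, 6.0),
    Candidate("truck", 7.0, 2.5),
    Candidate("large building", 40.0, 20.0),
    Candidate("shed", 6.0, 3.0),
]
print([c.label for c in screen(scene)])  # ['truck', 'shed']
```

The shed passes because it shares the size and aspect ratio of a vehicle, which is the background-dependent false alarm mechanism noted above.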

The  Avionics  Laboratory  is  currently  pursuing  a  more  advanced  form  of  screening 
technology.  The  early  thermal  sensors  provided  the  observer  sufficient  resolution  to 
detect  hot  spots.  As  thermal  sensor  performance  improved,  the  image  quality  became 
adequate  to  operate  on  the  external  features  of  the  targets  (edges,  length/width,  shape). 
The  screener  technology  was  considered  an  add-on  to  a  sensor  designed  for  a  human 
observer. The current Avionics Laboratory thrust is to let the screener/processor requirement
drive the sensor design. Sensor trade-offs will be made in the areas of
field  of  view,  dynamic  range,  resolution,  sensitivity  and  image  enhancement. 

The  number  of  detectors  on  the  focal  plane  has  been  limited  to  some  degree  by  the 
practical  limits  of  the  display;  525  or  875  lines  on  the  display  were  the  two  most 
popular  sizes.  The  limit  was  really  the  human  observer's  inability  to  cope  with  the 
band  width  fed  to  him  from  the  display.  The  other  driving  limit  on  number  of  detectors 
was  the  desire  to  maximize  the  field  of  view  to  allow  the  observer  to  pilot  the  aircraft. 
Now, with the introduction of image screening and the advances in butted infrared focal
plane arrays, concepts for sensors with thousands of detectors are conceivable. For the
piloting task, the large focal plane can be used to display, at much reduced resolution,
a wide field of view. For the target acquisition task, the large focal plane allows
sufficient resolution to be achieved over a wide field of view. The screener, operating
on this full resolution, can search the larger area and cue potential targets. The cued
targets can then be highlighted on the low resolution wide field of view display, or a
blow-up of the target can be displayed to the observer at full resolution.

The image screener is able to handle real time band widths orders of magnitude
greater than those handled by the observer. It is now practical to design the thermal
sensor to give an output that allows the screener access to more target information
than would be presented to the observer. The dynamic range of Electro-Optical multiplexed
FLIRs is in the 20-30 dB range. Digital multiplexing would allow the dynamic
range to expand to 50-70 dB. The additional information cannot be displayed but can
be  utilized  by  the  screener.  The  same  logic  is  true  for  types  of  enhancements.  The 
human  and  the  processors  have  different  needs.  The  conclusion  of  this  trend  is  that  it 
is  now  possible  to  design  a  sensor/processor  system  that  provides  imagery  of  sufficient 
quantity  that  the  screener  is  able  to  do  a  much  improved  job  of  false  alarm  rejection  or 
target  classification.  Syntactic  image  screening  is  one  example. 

Syntactic  screening  means  using  the  geometric  relationship  of  the  internal  target 
structure  to  provide  more  information  for  decisions.  A  tank  has  a  hot  engine  compartment, 
hot  exhaust  ports  and  sometimes,  hot  treads.  The  geometric  relationship  of  these  heat 
sources  is  known.  If  the  heat  sources  are  present,  the  screener  can  utilize  them  to 
further improve the probabilities of correct classification. In the past, we taught the
image screeners to recognize external shapes such as circles or triangles. Now we can
use the improved sensor thermal or spatial resolution to operate on the internal shapes.
The screener can be taught that a circle on top of a triangle is an ice cream cone. In cases
where part of the target is masked, this internal structure is useful for separating
potential from false targets.
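
One simple way to test whether a group of hot spots has a known geometric relationship is to compare their mutual spacings with those of a stored template, which makes the check independent of target position and heading. This is a minimal sketch and not the Avionics Laboratory's algorithm: the template coordinates and tolerance are hypothetical, matching spacings is a necessary rather than sufficient condition, and partially masked targets are not handled.

```python
import math


def pairwise_distances(points):
    """Sorted distances between every pair of points."""
    return sorted(
        math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]
    )


def matches_template(hot_spots, template, tolerance_m=0.5):
    """True if the hot spots have the same mutual spacing as the template."""
    if len(hot_spots) != len(template):
        return False
    observed = pairwise_distances(hot_spots)
    expected = pairwise_distances(template)
    return all(abs(o - e) <= tolerance_m for o, e in zip(observed, expected))


# Hypothetical layout (meters): engine compartment, exhaust port, tread.
TANK_TEMPLATE = [(0.0, 0.0), (0.0, 1.5), (3.0, 0.0)]

rotated_target = [(10.0, 10.0), (8.5, 10.0), (10.0, 13.0)]  # same layout, turned
three_fires = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]         # unrelated hot spots

print(matches_template(rotated_target, TANK_TEMPLATE))  # True
print(matches_template(three_fires, TANK_TEMPLATE))     # False
```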

CUEING USING A SEPARATE SENSOR

Radar. If the specific reconnaissance aircraft has a radar capable of detecting
military targets plus an MTI capability it can provide cues on which to point the TV or
FLIR sensor for recognition and identification or to mark the line scan imagery. The
radar must be capable of acquiring targets at low depression angles but at moderate
ranges (less than 10 km) from low altitudes. It must be capable of detecting the same
types of targets that the surveillance radar does in order to reacquire those which have
moved in the interim. In low altitude operation at very low depression angles foliage
masking becomes a severe problem in many geographical areas. This problem may be overcome
by using long wavelengths (low frequencies) or combinations of two or more long
wavelengths. Such techniques have been proven to detect military targets in heavy
foliage. This radar need not look forward but can operate in the side looking, synthetic
aperture, digitally processed mode. As it need only detect, not recognize, its resolution
can be poor and it need not "map" the terrain, just provide the direction and range to
point the identification sensor or mark the line scan imagery. At low resolution, band
width is low and cost can therefore be quite low compared to most radars. The TV or FLIR
can then be zoomed in on the suspected target by virtue of approach and by narrowing the
field of view with the operator concentrating on identifying, not searching. A finite
number of false alarms can be tolerated as long as principal targets of interest are
detected and the clutter is not too great.
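
Turning a radar cue (direction and range) into pointing angles for the TV or FLIR can be sketched with flat-terrain geometry. The function name, the flat-terrain assumption and the numerical values are illustrative only.

```python
import math


def pointing_angles(altitude_m, slant_range_m, bearing_deg):
    """Return (azimuth_deg, depression_deg) for the identification sensor.

    Assumes flat terrain, so the depression angle follows directly from the
    aircraft altitude and the radar slant range to the cue.
    """
    if slant_range_m < altitude_m:
        raise ValueError("slant range cannot be shorter than the altitude")
    depression_deg = math.degrees(math.asin(altitude_m / slant_range_m))
    return bearing_deg % 360.0, depression_deg


# Hypothetical cue: aircraft at 150 m, target at 5 km slant range, bearing 370 deg.
azimuth, depression = pointing_angles(150.0, 5000.0, 370.0)
print(azimuth, round(depression, 2))  # 10.0 1.72
```

At these altitudes and ranges the depression angle is under two degrees, which illustrates why foliage and terrain masking are so significant for the cueing radar.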

The  Avionics  Laboratory  is  currently  evaluating  a  digitally  processed  low  frequency 
SAR  radar,  called  IMFRAD  (Integrated  Multi-Frequency  Radar).  Preliminary  results  indicate 
the  ability  to  detect  tactical  targets  masked  by  camouflage  and  foliage  from  low  altitude 
standoff  surveillance  ranges.  The  IMFRAD  radar  information  is  processed  on  board  the 
aircraft  in  real  time  and  does  detect  both  stationary  and  moving  tactical  sized  targets. 
Because  of  the  automatic  processing  (the  radar  detects  only  the  metal  vehicular  sized 
objects)  the  IMFRAD  type  radar  can  be  used  to  search  large  areas  and  cue  a  high  resolution 
sensor  for  the  final  target  classification. 

Three  Dimensional  Target  Classification  -  Recent  experiments  with  a  modulated,  laser 
line  scan  sensor  have  indicated  the  possibility  of  a  very  discrete  automatic  target 
classification  capability  using  phase  information  to  very  accurately  discriminate  specific 
military  objects  by  virtue  of  their  shape  and  height  characteristics.  Automatic,  real 
time classification of specific types of targets has been demonstrated with a high probability
of classification and a low false alarm rate. To date, coverage of such a sensor
has  been  restricted  to  no  more  than  20°  from  nadir  so  its  applicability  as  a  cuer  for 
human  confirmation  with  a  FLIR  or  TV  in  a  high  speed  vehicle,  or  for  very  wide  angle 
coverage,  remains  to  be  demonstrated.  Its  output  can  be  directly  applied  to  a  downward 
looking  sensor.  In  fact,  being  a  line  scanner,  it  has  its  own  pictorial  output  in  the 
reflectance  domain  as  well  as  a  three  dimensional  image. 
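
How phase information from a modulated laser can yield height may be illustrated with the general phase-ranging relationship, in which the round-trip phase shift of the modulation is proportional to range. The source does not give the sensor's modulation details, so the 10 MHz modulation frequency, the phase values and the function names are all assumptions, and the height calculation applies only directly below the sensor.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_from_phase(phase_rad, modulation_hz):
    """One-way range corresponding to a round-trip modulation phase shift."""
    return SPEED_OF_LIGHT_M_S * phase_rad / (4.0 * math.pi * modulation_hz)


def unambiguous_range_m(modulation_hz):
    """Range interval after which the measured phase wraps around."""
    return SPEED_OF_LIGHT_M_S / (2.0 * modulation_hz)


def height_above_ground(phase_ground_rad, phase_object_rad, modulation_hz):
    """Height of an object top at nadir, which is closer than the ground."""
    return range_from_phase(phase_ground_rad - phase_object_rad, modulation_hz)


# Hypothetical measurement at an assumed 10 MHz modulation frequency.
height = height_above_ground(2.0, 1.5, 10e6)
print(round(height, 2), round(unambiguous_range_m(10e6), 2))  # 1.19 14.99
```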

The  Air  Force  is  embarking  on  a  program  titled  Target  Cueing  and  Classification 
Sensor.  This  program  is  designed  to  provide  USAF  with  a  real  time,  airborne,  target 
classification  (recognition)  capability  for  the  low  altitude,  high  speed,  penetration 
mission.  The  sensor  will  be  capable  of  obtaining  three  dimensional  spatial  information 
and  classifying  targets  based  on  shape  discrimination  to  provide  a  high  probability  of 
detection  and  a  low  false  alarm  rate.  The  system  will  process  the  scene  data  in  real 
time,  <0.1  sec,  and  provide  the  user  with  the  target  name  and  location  with  respect  to 
aircraft  position. 

This  automatic  target  classification  capability  will  provide  usable  reconnaissance 
information  in  real  time  for  tactical  missions.  This  is  especially  valuable  against 
mobile  tactical  vehicles  such  as  SAMs,  AAA,  tanks,  etc.  This  system  will  have  direct 
application  in  the  TAC  Quick  Strike  Reconnaissance  and  Strike  Control  and  Reconnaissance 
missions  as  well  as  in  conventional  real-time  reconnaissance.  Since  targets  are  classified 
and  located  in  real  time,  information  can  be  transmitted  to  the  user  over  a  low  band 
width data link. The cues will also alleviate the human saturation problem experienced
with  uncued  high  band  width  imagery. 

Multispectral  Cueing.  Use  of  two  or  more  sensors  or  one  sensor  operating  in  more 
than one spectral band, or wavelength, can provide information which can help to discriminate
targets from their backgrounds. This is particularly appropriate to camouflaged
targets  in  which  other,  readily  sensed,  characteristics  may  be  the  same.  A  tremendous 
amount  of  research  has  been  applied  to  spectral  signatures  and  discrimination  techniques. 


TABLE I

INCREASE FUNCTIONAL UTILITY VIA MULTI-SENSOR

Sensors compared: MMW RADAR, FLIR, CO2 LASER

Functions: WIDE AREA SEARCH/ACQUISITION; RECOGNITION; RANGING; TRACKING; MOVING TARGET
INDICATION; NAVIGATION; TERRAIN FOLLOWING/AVOIDANCE; WIRE/OBSTACLE AVOIDANCE; WEAPON
DELIVERY/GUIDANCE

Conditions and compatibility: DAY/NIGHT; ADVERSE WEATHER; ALL WEATHER; SMOKE PENETRATION;
COVERT; PROJECTED MUNITION COMPATIBILITY; PROCESSING COMPATIBILITY

[The individual table entries (X marks, "QUASI", "WITH CO2 LASER RADAR", "WITH MMW RADAR")
cannot be assigned to cells from the scanned source.]


Most of the research has addressed wavebands in the visible and near infrared electro-optic
spectrum. With the increased emphasis on decoy discrimination, increased range, and
improved dust/smoke penetration, the spectrum of interest for multiple wavelength imaging
has been extended to the millimeter and radar wavebands. Table I illustrates the utility
of the millimeter and active and passive thermal bands in performing various functions
under various environmental conditions. The trade-off of "optimum" sensor mix also
depends on the state of development of the sensor technology, the associated
processing available, and the scenario selected. Table II rank orders sensor suites for the
European Theater. Again, the ranking did not consider the state of development of the
sensor and sensor integration/cueing technology. Currently, the Avionics Laboratory is
pursuing the development of advanced sensors in the millimeter, and active and passive
wavebands.
TABLE II

RECOMMENDED SENSOR SUITE PERFORMANCE RANKING FOR EUROPEAN THEATER (U)

                            GCI/SAM    ARMOR      AIRCRAFT   OVERALL PERFORMANCE
SENSOR SUITES               RANKING    RANKING    RANKING    RANKING

ACTIVE/PASSIVE NEAR IR         4          4          2          3
PASSIVE IR FLIR                7          4          2          7
PASSIVE/ACTIVE IR              5          2          3          2
[illegible] IR                 6          1          4          1
HIGH RESOLUTION SAR            8          5          1          5
SAR AND EM EMISSIONS           1          6          5          6
[illegible]                    2          7          5          8
MM WAVE                        3          3          5          4


In  parallel,  a  Targeting  Systems  Characterization  Facility  has  been  established.  This 
in-house Avionics Laboratory effort operates, measures, and models the performance of
targeting  sensors  under  poor  weather  conditions.  The  ability  of  various  sensors  to 
locate  and  identify  military  targets  is  evaluated  over  a  calibrated  atmospheric  path. 

The  atmospheric  data  is  also  made  available  to  update  and  improve  atmospheric  models 
such  as  the  Air  Force  Geophysics  Laboratory  LOWTRAN  atmospheric  propagation  model. 
Simultaneous  EO  Sensor/Atmospheric  Correlation  and  Atmospheric  Transmission  is  measured 
in  the  visual,  IR  bands,  10.6  microns,  and  in  the  95  GHz  band.  The  parallel  improvement 
of sensors, processors and actual simultaneous multi-wavelength measurements allows
optimization of multispectral sensor suites.


TABLE III

EFFORTS IN CUEING, AUTOMATIC SCREENING AND TARGET CLASSIFICATION

WAVEBAND                  ONGOING    [...]ENT    PLANNED    [illegible]

ACTIVE IR (3-D)              1          1           0           2
PASSIVE IR                   0          4           0           1
MM                           3          0           0           1
FIXED TARGET RADAR           3          1           1           5
MOVING TARGET RADAR          3          0           0           0

CONCLUSION

Table  III  summarizes  the  efforts  ongoing  or  planned  by  the  Reconnaissance  and  Weapon 
Delivery  Division  of  the  Avionics  Laboratory.  The  exploratory  development  efforts  are 
attempting  to  prove  the  utility  of  the  fundamental  technology.  The  advanced  development 
activity evaluates the technology in a high performance military environment. Table III
indicates  trends  in  the  state  of  maturity  of  the  various  technologies.  Efforts  in 
processing  passive  infrared  imagery  are  the  most  mature. 

The  processing  technology  is  key  to  allowing  pilots  in  single  seat  aircraft  to  deal 
with  the  target  acquisition  task  loading.  In  a  target  rich  environment,  pilots  need  to 
know  which  targets  are  threats,  especially  in  a  defense  suppression  mission.  The  improved 
automatic  screening  performance  is  obtained  through  the  integration  of  improved  target 
acquisition  sensors  and  associated  processing.  The  integrated  screener  technology  offers 
the  following  improvements: 

-  Reduced  aircrew  workload 

-  Automatic  nomination  of  correct  targets 

-  More  survivable  flight  profiles 

-  Correct  weapons  against  each  target 

-  Targeting  decisions  at  longer  ranges 

Rapid mobility of modern military forces dictates two fundamental requirements for
reconnaissance  operations:  timeliness  and  completeness.  Timely  targeting  information  is 
needed  for  command  decisions  and  reconnaissance  operations  are  incomplete  unless  information 
is  accurate  and  available  during  day/night  and  all  weather  conditions.  Timeliness  and 
completeness  are  opposing  attributes  in  that  rapid  access  implies  a  limited  quantity  of 
highly  selective  data,  whereas  comprehensive  coverage  under  diverse  conditions,  with  a 
high  probability  of  success,  implies  a  large  amount  of  data.  The  approach  being  taken 
by the Avionics Laboratory to reduce this dichotomy is to employ technological advancements
in sensor performance, automatic data processing, automatic cueing and classification
sensors,  and  effective  correlation  and  data  handling  techniques.  In  addition,  a  sensor  or 
processing  concept  is  not  considered  viable  unless  it  is  affordable  and  compatible  with 
operational  systems  and  concepts. 